Hi Erick,
Thanks for the response.
I understood the reason for the regex match not working.
The help I am looking for from this forum is as follows.
1. All the example regex queries match only one term. Is there a
way in Solr to match multiple terms?
2. How can
Thanks Jack.
On 1/24/15, 3:57 PM, Jack Krupansky wrote:
Take a look at the RegexTransformer. Or, in some cases, you may need to use
the raw ScriptTransformer.
See:
https://cwiki.apache.org/confluence/display/solr/Uploading+Structured+Data+Store+Data+with+the+Data+Import+Handler
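For the cpe values discussed in this thread, a RegexTransformer sketch might look like this (the entity and column names are illustrative assumptions, not from the original config):

```xml
<!-- sketch: assumes the raw value arrives in a column named "cpe" -->
<entity name="cve" transformer="RegexTransformer">
  <!-- strip the leading "cpe:/o:" prefix, keeping the rest -->
  <field column="cpe_stripped" sourceColName="cpe"
         regex="^cpe:/o:(.*)$" replaceWith="$1"/>
  <!-- split the remainder into multiple values on ":" -->
  <field column="cpe_parts" sourceColName="cpe_stripped" splitBy=":"/>
</entity>
```

Whether chaining an intermediate column like this works in a single pass may depend on field ordering; if it does not, the strip and split can usually be combined into one regex instead.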
-- Jack
Yes - I am using DIH and I am reading the info from an XML file using
the URL datasource, and I want to strip the cpe:/o and tokenize the data
by (:) during import so I can then search it as I've described. So, my
question is this:
Is there any built-in logic via a transformer class that
How are you currently importing data?
-- Jack Krupansky
On Sat, Jan 24, 2015 at 3:42 PM, Carl Roberts carl.roberts.zap...@gmail.com
wrote:
Sorry if I was not clear. What I am asking is this:
How can I parse the data during import to tokenize it by (:) and strip the
cpe:/o?
On 1/24/15,
Or, maybe... he's using DIH and getting these values from an RDBMS query
and now wants to index them in Solr. Who knows!
It might be simplest to transform the colons to spaces and use a normal
text field. Although you could use a custom text field type that used a
regex tokenizer which
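A custom field type along those lines might look like this (a sketch; the type name and the lowercase filter are assumptions):

```xml
<fieldType name="colon_delimited" class="solr.TextField">
  <analyzer>
    <!-- split on ":" so cpe:/o:freebsd:freebsd:2.2.3 indexes as the
         separate terms cpe, /o, freebsd, freebsd, 2.2.3 -->
    <tokenizer class="solr.PatternTokenizerFactory" pattern=":"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```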
You are using keywords here that seem to contradict each other.
Or your use case is not clear.
Specifically, you are saying you are getting stuff from a (Solr?)
query. So, the results are now outside of Solr. Then you are asking
for help to strip stuff off it. Well, it's outside of Solr, do
The unzipped XML that I am reading looks like this:
<nvd xmlns:scap-core="http://scap.nist.gov/schema/scap-core/0.1"
     xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
     xmlns:patch="http://scap.nist.gov/schema/patch/0.1"
     xmlns:vuln="http://scap.nist.gov/schema/vulnerability/0.4"
Dear Solr community,
I have recently been diving into Solr and I need help with the following usage scenario.
I am working on a project to extract and search bibliographic metadata from
PDF files. First, my PDF files are processed to extract bibliographic
metadata such as title, authors,
Hi,
How can I parse the data in a field that is returned from a query?
Basically,
I have a multi-valued field that contains values such as these that are
returned from a query:
cpe:/o:freebsd:freebsd:1.1.5.1,
cpe:/o:freebsd:freebsd:2.2.3,
Via this rss-data-config.xml file and a class that I wrote (attached) to
download an XML file from a ZIP URL:
<dataConfig>
  <dataSource type="ZIPURLDataSource" connectionTimeout="15000"
              readTimeout="3"/>
  <document>
    <entity name="cve-2002"
            pk="id"
When I polled the various projects already using Solr at my organization, I
was greatly surprised that none of them were using Solr replication,
because they had talked about replicating the data.
But we are not Pinterest, and do not expect to be taking in changes one
post at a time (at least the
Hi Harish,
What happens when you purge deleted terms with
'solr/core/update?commit=true&expungeDeletes=true'
ahmet
On Sunday, January 25, 2015 1:59 AM, harish singh harish.sing...@gmail.com
wrote:
Hi,
I am noticing a strange behavior with solr facet searching:
This is my facet query:
The main question then is whether the full
cpe:/o:freebsd:freebsd:2.2.5 string needs to be stored in Solr.
If the desire is to actually strip that prefix altogether and never
see it in the Solr document, then Jack's suggestion is spot on. If it is
to store it as is but to index based on custom
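One common way to get both behaviors is a stored raw field plus a copyField into an analyzed field used only for searching (a sketch; the field names and the text_general type are illustrative):

```xml
<!-- stored as-is, so cpe:/o:freebsd:freebsd:2.2.5 comes back unchanged -->
<field name="cpe" type="string" indexed="true" stored="true"
       multiValued="true"/>
<!-- analyzed copy used only for searching, never returned -->
<field name="cpe_text" type="text_general" indexed="true" stored="false"
       multiValued="true"/>
<copyField source="cpe" dest="cpe_text"/>
```

text_general's standard tokenizer generally breaks on punctuation such as ":" and "/"; a custom colon-splitting analyzer could be substituted if finer control is needed.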
Hi,
I am noticing a strange behavior with solr facet searching:
This is my facet query:
params: {
  facet: true,
  sort: startTimeISO desc,
  debugQuery: true,
  facet.mincount: 1,
  facet.sort: count,
  start: 0,
  q: requestType:(*login* or
You could use nested entities in DIH.
So, if you store - for example - path to the PDF in the database, you
could do a nested entity with TikaEntityProcessor to load the content.
Just make sure the field names do not conflict.
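A sketch of that arrangement (table, column, path, and data-source names are all hypothetical):

```xml
<dataConfig>
  <dataSource name="db" type="JdbcDataSource" driver="org.postgresql.Driver"
              url="jdbc:postgresql://localhost/docs" user="solr"/>
  <!-- binary data source for reading the PDF files from disk -->
  <dataSource name="bin" type="BinFileDataSource"/>
  <document>
    <entity name="doc" dataSource="db"
            query="SELECT id, title, pdf_path FROM documents">
      <field column="id" name="id"/>
      <field column="title" name="title"/>
      <!-- nested entity: Tika extracts text from the file at pdf_path -->
      <entity name="pdf" processor="TikaEntityProcessor" dataSource="bin"
              url="/data/pdfs/${doc.pdf_path}" format="text">
        <field column="text" name="content"/>
      </entity>
    </entity>
  </document>
</dataConfig>
```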
Regards,
Alex.
Sign up for my Solr resources newsletter at
Thanks Alex, indeed, the relative path to PDF document is stored in the
database. I will try to use your approach.
Regards,
Yusniel Hidalgo
----- Original Message -----
From: Alexandre Rafalovitch arafa...@gmail.com
To: solr-user solr-user@lucene.apache.org
Sent: Saturday, January 24, 2015
Another thing to consider... If you only need custom stats for the current
result page then there is no need to keep stats for the full result set. In
this case you could perform your custom collapse and generate the stats
just for the current page. The ExpandComponent could be altered to do that
Take a look at the RegexTransformer. Or, in some cases, you may need to use
the raw ScriptTransformer.
See:
https://cwiki.apache.org/confluence/display/solr/Uploading+Structured+Data+Store+Data+with+the+Data+Import+Handler
-- Jack Krupansky
On Sat, Jan 24, 2015 at 3:49 PM, Carl Roberts
You are probably running into
https://issues.apache.org/jira/browse/SOLR-6931
On Sat, Jan 24, 2015 at 12:09 AM, Mike Drob mad...@cloudera.com wrote:
I'm not sure what a reasonable workaround would be. Perhaps somebody else
can brainstorm and make a suggestion, sorry.
On Tue, Jan 20, 2015 at
If you make your field type string the regex may work as expected.
But as others said, splitting into separate fields is likely more flexible.
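For reference, a string-typed field (solr.StrField, no analysis) keeps each value as a single indexed token, which is what a regex query is evaluated against (a sketch; the field name is illustrative):

```xml
<!-- the whole value, e.g. cpe:/o:freebsd:freebsd:2.2.3, is one token -->
<field name="cpe" type="string" indexed="true" stored="true"
       multiValued="true"/>
```

A regex query on that field then matches the entire value rather than individual analyzed terms.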
Erik
On Jan 23, 2015, at 23:58, Arumugam, Suresh suresh.arumu...@emc.com wrote:
Hi All,
We have indexed the documents into Solr but are not able
When I first read your post I thought this example had something to do with
pipe, but now I realize that ::PIPE:: is simply a symbolic
representation of what we software people call a pipe, namely the
vertical bar character used as a field separator. Usually, terms and tokens
are all of the same
Daniel Cukier [danic...@gmail.com] wrote:
The servers have around 4M documents and receive a constant
flow of queries. When the solr server starts, it works fine. But after some
time running, it starts to take longer to respond to queries, and the server
I/O goes crazy to 100%. Look at the New
On 23 January 2015 at 22:52, Daniel Cukier danic...@gmail.com wrote:
I am running around eight solr servers (version 3.5) instances behind a
Load Balancer. All servers are identical and the LB is weighted by number
connections. The servers have around 4M documents and receive a constant
flow
What version of Solr are you using? What GC parameters are you using? Do
you have GC logs enabled? Look at full GC times in those logs and see
what's happening. This particular problem is usually because replicas
cannot accept the rate of updates and they fall back to recovery state. You
should
Sorry if I was not clear. What I am asking is this:
How can I parse the data during import to tokenize it by (:) and strip
the cpe:/o?
On 1/24/15, 3:28 PM, Alexandre Rafalovitch wrote:
You are using keywords here that seem to contradict each other.
Or your use case is not clear.