You can specify the file name as the id by adding a TemplateTransformer on
the entity x and specifying ${f.file} as the template value in the id
field. For example:
<dataSource type="FileDataSource" />
<document>
<entity name="f" processor="FileListEntityProcessor"
baseDir="F:\Work\Lucene\Solr\Solr"
I think this may be the same bug as LUCENE-5289 which was fixed in 4.5.1.
Can you upgrade to 4.5.1 and see if that solves the problem?
On Fri, Jun 6, 2014 at 7:17 PM, Justin Sweeney justin.sweene...@gmail.com
wrote:
Hi,
An application I am working on indexes documents to a Solr index. This
Thanks, it is working fine but I had to change the following line
<field name="id" template="${f.file}" />
to
<field column="id" template="${f.file}" />
On Mon, Jun 9, 2014 at 9:29 AM, Shalin Shekhar Mangar [via Lucene]
ml-node+s472066n4140715...@n3.nabble.com wrote:
You can specify the file name as
Hello all,
I was wondering what the onlyMorePopular option for spellchecking uses
as its threshold. Will it always pick the suggestion that returns the most
queries, or does it base its result on some threshold that can be
configured?
Thanks!
Ali.
--
View this message in
I'm really at a dead end.
My index is 5.6 GB and holds about 8 million documents.
The field I'm using for the filter is as simple as it gets:
<field name="class_name" type="string" indexed="true" stored="true"
multiValued="false"/>
Can other fields affect my search if I only run a filter query?
Hi
I am using SimplePostTool to post XML files to Solr like:
java -Durl=http://localhost:8080/solr/collection1/update -jar
/var/lib/tomcat6/solr/collection1/dump/xmlinput/post.jar
/var/lib/tomcat6/solr/collection1/dump/xmlinput/solr.xml
When there are certain errors, the response
I am using Solr 4.6 with sharding (distributed search). I have a
situation where I would like to modify the Solr search result (DocList and
DocSet) inside Solr's QueryComponent, right after the following method is
called from the process() method.
searcher.search(result, cmd);
To be of any help we'd need to know what your documents look like, what
your queries look like, and what the specifications of your server are: how
much heap is dedicated to Solr and how much free memory is available for the
OS file cache. You have to figure out the bottleneck. Is it CPU or RAM or
Disk?
I'm wondering what the best practice for large disjunct queries in Solr is.
A user wants to submit a query for several hundred thousand terms, like:
(term1 OR term2 OR ... term500,000)
I know it might be better to break this up into multiple queries that can
be merged on the user's end, but I'm
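The "break this up into multiple queries" idea mentioned above could be sketched on the client side roughly as follows. This is only an illustration, not something from the thread: the function names, the field name passed in, and the batch size of 500 are all assumptions (500 was picked to stay under Solr's default maxBooleanClauses of 1024).

```python
from typing import Iterable, Iterator, List


def batched_or_queries(field: str, terms: Iterable[str],
                       batch_size: int = 500) -> Iterator[str]:
    """Yield one 'field:("t1" OR "t2" ...)' query string per batch of terms."""
    batch: List[str] = []
    for term in terms:
        # Quote each term so whitespace and query operators are taken literally.
        escaped = term.replace("\\", "\\\\").replace('"', '\\"')
        batch.append('"' + escaped + '"')
        if len(batch) == batch_size:
            yield field + ":(" + " OR ".join(batch) + ")"
            batch = []
    if batch:
        # Emit the final, possibly shorter, batch.
        yield field + ":(" + " OR ".join(batch) + ")"


def merge_ids(result_batches: Iterable[Iterable[str]]) -> List[str]:
    """Merge per-batch result ids, dropping duplicates but keeping order."""
    seen = set()
    merged: List[str] = []
    for ids in result_batches:
        for doc_id in ids:
            if doc_id not in seen:
                seen.add(doc_id)
                merged.append(doc_id)
    return merged
```

Each yielded string would be sent as a separate request, and merge_ids would combine the ids returned by those requests. Any relevancy ranking across batches is lost this way, so it only suits bulk retrieval.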
Are they expecting relevancy ranking or merely seeking to a bulk read of
those documents? Please detail what the user is trying to accomplish with
such a monster list of IDs.
Generally, queries of more than a few dozen terms are a bad idea. If for no
other reason than that if you need to
Can you make a custom Component? They are pluggable.
Regards,
Alex
On 09/06/2014 6:24 pm, Vishnu Mishra vdil...@gmail.com wrote:
I am using solr 4.6 and I am using solr Sharding (Distributed Search). I
have
situation where I like to modify the solr search result (DocList and
DocSet)
I'd certainly go for the 2nd option. Depending on what you need, you won't
have to modify Solr itself but can extend it using different plugins.
You'll need to write different components depending on your specific
requirements. I definitely recommend the talks from Trey Grainger,
Thanks, Tim. Worked like a charm. Appreciate your timely assistance.
On Sat, Jun 7, 2014 at 9:13 PM, Timothy Potter thelabd...@gmail.com wrote:
Hi Mark,
Sorry for the trouble! I've now made the ami-1e6b9d76 AMI public;
total oversight on my part :-(. Please try again. Thanks Hoss for
I believe it will return the terms that are most similar to the queried terms
but have a greater term frequency than the queried terms. It doesn't actually
care what the term frequencies are, only that they are greater than the
frequencies of the terms you queried on.
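To make that description concrete, here is a toy model of the rule as stated in this reply. It is not Solr's spellchecker code; the function name, the frequency dictionary, and the assumption that candidates arrive already ordered by similarity are all made up for illustration.

```python
from typing import Dict, List


def more_popular_suggestions(query_term: str, candidates: List[str],
                             doc_freq: Dict[str, int]) -> List[str]:
    """Keep only candidates that are strictly more frequent than the query term."""
    baseline = doc_freq.get(query_term, 0)
    # Candidates are assumed to be ordered by similarity already, so the
    # filter preserves that order and only applies the frequency test.
    return [c for c in candidates if doc_freq.get(c, 0) > baseline]
```

In this model there is no absolute threshold: a candidate is kept whenever its frequency is greater than that of the queried term, however small the difference.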
I do not know your use
Hi All,
I was curious to know whether communication between multiple collections can
be achieved and, if so, by what means.
The use case is this: with multiple collections, I need to query the first
collection and use the unique ids it returns to query the second
one (a foreign key relation). Now if the
No, Anyways thanks Alex, but where is the luke jar?
With Regards
Aman Tandon
On Mon, Jun 9, 2014 at 6:54 AM, Alexandre Rafalovitch arafa...@gmail.com
wrote:
Have you looked at:
https://github.com/DmitryKey/luke
Regards,
Alex.
Personal website: http://www.outerthoughts.com/
Current
On Tue, Jan 7, 2014 at 1:53 PM, Yonik Seeley ysee...@gmail.com wrote:
[...]
Next major feature: Native Code Optimizations.
In addition to moving more large data structures off-heap (like
UnInvertedField?), I am planning to implement native code
optimizations for certain hotspots. Native code
Just click the 'Releases' link:
https://github.com/DmitryKey/luke/releases
François
On Jun 9, 2014, at 10:43 AM, Aman Tandon amantandon...@gmail.com wrote:
No, Anyways thanks Alex, but where is the luke jar?
With Regards
Aman Tandon
On Mon, Jun 9, 2014 at 6:54 AM, Alexandre
Well, you've omitted information about the most precious resource for
Solr, memory.
That said, this question is impossible to answer in the abstract, see:
http://searchhub.org/2012/07/23/sizing-hardware-in-the-abstract-why-we-dont-have-a-definitive-answer/
Best,
Erick
On Sun, Jun 8, 2014 at
Yeah, just got it, thanks François :)
With Regards
Aman Tandon
On Mon, Jun 9, 2014 at 8:20 PM, François Schiettecatte
fschietteca...@gmail.com wrote:
Just click the 'Releases' link:
https://github.com/DmitryKey/luke/releases
François
On Jun 9, 2014, at 10:43 AM, Aman Tandon
Thanks
--
View this message in context:
http://lucene.472066.n3.nabble.com/Deepy-nested-structure-tp4140397p4140802.html
Sent from the Solr - User mailing list archive at Nabble.com.
thanks
--
View this message in context:
http://lucene.472066.n3.nabble.com/Deepy-nested-structure-tp4140397p4140803.html
My first answer is don't do it that way :).
Solr works best with flattened (de-normalized) data. If at all
possible, you _really_ would be better off combining the two
collections and flattening the data even though there would be more
data.
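As a rough sketch of what flattening could look like before indexing, assume each record in the second collection carries a foreign key pointing at a record in the first. The function name, the "id" key, and the parent_ prefix are illustrative assumptions, not anything prescribed by Solr.

```python
from typing import Any, Dict, List


def flatten(parents: List[Dict[str, Any]], children: List[Dict[str, Any]],
            key: str) -> List[Dict[str, Any]]:
    """Copy each parent's fields onto every child document that references it."""
    by_id = {p["id"]: p for p in parents}
    flat: List[Dict[str, Any]] = []
    for child in children:
        parent = by_id.get(child.get(key))
        if parent is None:
            continue  # skip children whose foreign key matches no parent
        doc = dict(child)
        for name, value in parent.items():
            if name != "id":
                # Prefix parent fields so they cannot clash with child fields.
                doc["parent_" + name] = value
        flat.append(doc)
    return flat
```

The resulting documents repeat the parent data on every child, which is the "more data" trade-off mentioned above, but a single query against one collection can then filter on both sides of the relation.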
Whenever I see a question like this, I wonder if you're
Dear Solr experts,
I have 2 problems I need your help with.
1) I have to group a list with group.limit=1&group.main=true&group.sort=Date
desc (many groups, each holding only its newest element). Then, from the list
of groups (each group has 1 element), I want to filter in order to remove items
(in groups) not matches
On 6/8/2014 4:17 PM, shushuai zhu wrote:
I would like to get some advice to setup a Solr Cloud on a set of powerful
machines. The average size of the documents handled by the Solr Cloud is
about 0.5 KB, and the number of documents stored in Solr Cloud could reach
billions. When indexing,
On 6/8/2014 12:09 PM, rashi gandhi wrote:
I am using SolrMeter for performance benchmarking. I am able to
successfully test my solr setup up to 1000 queries per min while
searching.
But when I exceed this limit, say 1500 search queries per min, I am
facing Server Refused Connection in Solr.
Hi Chris,
Created ticket https://issues.apache.org/jira/browse/SOLR-6154
Included to the ticket the data.xml and a PDF with instructions on how to
replicate.
Sending different updates to different ports was just how the confluence
tutorial made the steps; it does not affect the result of the
hi,
prod: p
cat : catA,catB,catC
prod :q
cat : catB, catC,catD
My schema consists of documents with uid 'prod', and each can belong
to multiple categories called 'cat', which are represented as a
multivalued field. For a particular kind of query I need to access
individual elements
Hi,
We have a SolrCloud cluster (5 shards and 2 replicas) on 10 boxes. On some of the
boxes we have about 5 million deleted docs, and we have never run an optimization
since the beginning. Does the number of deleted docs have anything to do with
query performance? Should we consider optimization at all
Hey guys,
I'm trying to simply create collection foo in SolrCloud (to a collection
that failed to create once due to a badly formatted schema).
I try the following:
createCollection foo - could not create a new core
solr/foo_shard1_replica1 as another core is already defined there
Hi,
I don't remember the last time I ran optimize. Sure, yes, things will work
faster if you optimize an index and reduce the number of segments, but if
you are regularly writing to that index and performance is OK, leave it to
Lucene segment merges to purge deletes.
Otis
--
Performance Monitoring
the incoming document rate could be as high as 20k/second...
That sounds like a lot of CPU eager indexing work, given the 128 CPU cores
available, from indexing speed perspective: would you recommend having a
similar number of solr cores created, or Solr does just a when several with
a small
Check out the patch on the issue below. We hit the same issue and posted a
patch. None of the committers have picked it up yet, but it would be good to
get some feedback on it and get this into the next dot release. If it works
for you, please vote it up.
Not currently.
You could have separate explicit fields for the categories such as cat_1,
cat_2, etc. The data would need to be replicated (possibly using a
copyField), but redundancy to facilitate access is a reasonable approach.
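One way to populate such positional fields is on the client, before the document is sent to Solr. The sketch below assumes that approach; the function name and the cat_N naming pattern are illustrative, and the schema would still need matching field or dynamicField definitions.

```python
from typing import Any, Dict


def add_positional_fields(doc: Dict[str, Any], field: str = "cat") -> Dict[str, Any]:
    """Return a copy of doc with field_1, field_2, ... added for each value."""
    expanded = dict(doc)
    for position, value in enumerate(doc.get(field, []), start=1):
        # The original multivalued field is kept; these copies are redundant
        # on purpose so individual positions can be queried directly.
        expanded[field + "_" + str(position)] = value
    return expanded
```

For a document with cat values catA, catB, catC this adds cat_1, cat_2 and cat_3, so a query such as cat_2:catB can target one position.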
-- Jack Krupansky
-Original Message-
From:
Thanks for the response Jack
--
View this message in context:
http://lucene.472066.n3.nabble.com/accessing-individual-elements-of-a-multivalued-field-tp4140862p4140911.html
hi all,
I need help simplifying my query. The doc structure is as follows.
docStructure
id A
cat : p, q, r
id B
cat : m, n ,o
id C
cat: l,b, o
Now given this structure my job is to find documents which have cat ids
belonging to a list. Right now this is achieved in this fashion using OR of
Hi Aman,
Yeah, we are thinking the same. Using UIMA is better. And thanks to
everyone. You guys really showed us the way (UIMA).
We'll work on it.
Thanks,
Vivek
On Fri, Jun 6, 2014 at 5:54 PM, Aman Tandon amantandon...@gmail.com wrote:
Hi Vivek,
As everybody in the mail list mentioned