Re: Solr - Mutivalue field search on different elements

2011-12-28 Thread Ahmet Arslan
I can't delete 1s, 2s, etc. from my field value; I have to keep the text in this format, so I'll apply slop in my search to get the search I need. It is OK if you can't delete 1s, 2s, etc. from the field value. We can eat up those special markups in the analysis chain.
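
A minimal sketch of what "eating up" the markers could look like, written in Python only to illustrate the substitution. The marker format (a number followed by "s:", as in the sample string quoted elsewhere in this archive) and the regex are assumptions; in Solr the same replacement might be done by a pattern-replace char filter placed ahead of the tokenizer.

```python
import re

# Assumed marker format: digits followed by "s:" (for example "1s:" or "12s:").
MARKER = re.compile(r"\b\d+s:\s*")

def strip_markers(text: str) -> str:
    """Remove the markers so that only the remaining text would be analyzed."""
    return MARKER.sub("", text).strip()

print(strip_markers("1s: This is 2s: a test"))
```

Because the markers are removed only for analysis, the stored field value would keep its original format.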

Re: Solr - Mutivalue field search on different elements

2011-12-28 Thread meghana
Hi Koji, thanks for the reply. I tried adding a BoundaryScanner to my solrconfig.xml and set hl.useFastVectorHighlighter=true, termVectors=on, termPositions=on and termOffsets=on in my query. Even then I didn't see any effect on my highlighting. My Solr config setting is as below

Re: hl.boundaryScanner and hl.bs.chars

2011-12-28 Thread meghana
Hi Koji, thanks for the reply. I tried adding a BoundaryScanner to my solrconfig.xml and set hl.useFastVectorHighlighter=true, termVectors=on, termPositions=on and termOffsets=on in my query. Even then I didn't see any effect on my highlighting. My Solr config setting is as below

Re: hl.boundaryScanner and hl.bs.chars

2011-12-28 Thread Ahmet Arslan
I tried adding a BoundaryScanner to my solrconfig.xml and set hl.useFastVectorHighlighter=true, termVectors=on, termPositions=on and termOffsets=on in my query. Even then I didn't see any effect on my highlighting. Am I missing anything, or doing anything wrong? I like to make a

Re: hl.boundaryScanner and hl.bs.chars

2011-12-28 Thread Koji Sekiguchi
(11/12/28 17:08), Ahmet Arslan wrote: FastVectorHighlighter requires Solr 3.1 http://wiki.apache.org/solr/HighlightingParameters#hl.useFastVectorHighlighter Right. In addition, boundaryScanner requires 3.5. koji -- http://www.rondhuit.com/en/
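
A hedged sketch of the request side, in Python, for readers following this thread. As far as I know, termVectors, termPositions and termOffsets are field attributes declared in schema.xml (followed by a reindex), not request parameters, so only the hl.* settings belong in the query. The field name "text", the scanner name "simple" and the boundary characters are assumptions for illustration.

```python
from urllib.parse import urlencode

def highlight_params(query: str, field: str) -> str:
    """Build a query string that asks for FastVectorHighlighter output."""
    params = {
        "q": query,
        "hl": "true",
        "hl.fl": field,
        "hl.useFastVectorHighlighter": "true",  # needs Solr 3.1 or later
        "hl.boundaryScanner": "simple",         # needs Solr 3.5; name assumed to match solrconfig.xml
        "hl.bs.chars": ".,!?",                  # assumed boundary characters
    }
    return urlencode(params)

print(highlight_params("text:solr", "text"))
```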

Re: solr keep old docs

2011-12-28 Thread Lance Norskog
The SignatureUpdateProcessor is for exactly this problem: http://www.lucidimagination.com/search/link?url=http://wiki.apache.org/solr/Deduplication On Tue, Dec 27, 2011 at 10:42 PM, Alexander Aristov alexander.aris...@gmail.com wrote: I get docs from external sources and the only place I keep

Re: How to run the solr dedup for the document which match 80% or match almost.

2011-12-28 Thread Lance Norskog
You would have to implement this yourself in your indexing code. Solr has an analysis plugin which does the analysis for your text and then returns the result, but does not query or index. You can use this to calculate the fuzzy hash, then search against index. You might be able to code this in
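
One way the "match 80%" check could be sketched in indexing code. This Python example uses Jaccard similarity over word shingles, which is a swapped-in technique chosen for brevity and not the fuzzy hash mentioned above; the shingle size and the 0.8 threshold are assumptions.

```python
import re

def shingles(text: str, size: int = 3) -> set:
    """Return the set of overlapping word n-grams in the text."""
    words = re.findall(r"\w+", text.lower())
    if len(words) < size:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}

def similarity(a: str, b: str, size: int = 3) -> float:
    """Jaccard similarity of the two shingle sets, between 0.0 and 1.0."""
    sa, sb = shingles(a, size), shingles(b, size)
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)

def is_near_duplicate(a: str, b: str, threshold: float = 0.8) -> bool:
    """Treat two texts as duplicates when their similarity reaches the threshold."""
    return similarity(a, b) >= threshold
```

Comparing every new document against every indexed one does not scale, which is presumably why the reply suggests computing a hash and searching the index for it instead.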

Was:Re: hl.boundaryScanner and hl.bs.chars [off topic]

2011-12-28 Thread Tanguy Moal
Dear list, I'd like to bounce on that issue... IMHO, configuration parsing could be a little bit stricter... At least, what stands for a severe configuration error could be user-defined. Let me give some examples that are common errors and that don't trigger the abortOnConfigurationError

Re: How can I check if a more complex query condition matched?

2011-12-28 Thread Max
Thanks for your reply, I thought about using the debug mode, too, but the information is not easy to parse and doesn't contain everything I want. Furthermore I don't want to enable debug mode in production. Is there anything else I could try? On Tue, Dec 27, 2011 at 12:48 PM, Ahmet Arslan

RE: Poor performance on distributed search

2011-12-28 Thread ku3ia
Hi all. During my code review, I discovered the following: 1) as I wrote before, there seems to be a low disk read speed; 2) at ~/solr-3.5/solr/core/src/java/org/apache/solr/response/XMLWriter.java and in the same classes there is a writeDocList = writeDocs method, which contains a cycle for of all

Re: hl.boundaryScanner and hl.bs.chars

2011-12-28 Thread meghana
Thanks iorixxx and Koji for your replies. So can I fulfill my requirement by using hl.regex.pattern and setting hl.fragmenter=regex? I was looking at these parameters on the wiki. I am thinking of using them to show my highlighted text in my desired format. My string is like below 1s: This is

Re: solr keep old docs

2011-12-28 Thread Alexander Aristov
the problem with dedupe (SignatureUpdateProcessor ) is that it REPLACES old docs. I have tried it already. Best Regards Alexander Aristov On 28 December 2011 13:04, Lance Norskog goks...@gmail.com wrote: The SignatureUpdateProcessor is for exactly this problem:

Grouping results after Sorting or vice-versa

2011-12-28 Thread vijayrs
The issue I'm facing is that I don't get the expected results when I combine the group param and the sort param. The query is: http://localhost:8080/solr/core1/select/?qt=nutch&q=*:*&fq=userid:333&group=true&group.field=threadid&group.sort=date%20desc&sort=date%20desc where threadid is a hexadecimal string
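
For reference, the request parameters in the query above can be rebuilt one at a time, which makes each one easier to inspect. This is a Python sketch using only the values from the post. As I understand Solr's grouping, sort orders the groups relative to each other while group.sort orders the documents inside each group.

```python
from urllib.parse import urlencode

BASE = "http://localhost:8080/solr/core1/select/"

PARAMS = [
    ("qt", "nutch"),
    ("q", "*:*"),
    ("fq", "userid:333"),
    ("group", "true"),
    ("group.field", "threadid"),
    ("group.sort", "date desc"),  # order of documents within a group
    ("sort", "date desc"),        # order of the groups themselves
]

def grouped_query_url(base: str = BASE, params=PARAMS) -> str:
    """Join the base URL and the URL-encoded parameters."""
    return base + "?" + urlencode(params)

print(grouped_query_url())
```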

Indexing problem

2011-12-28 Thread mumairshamsi
http://lucene.472066.n3.nabble.com/file/n3616191/02.xml 02.xml I am trying to index this file. For this I am using this command: java -jar post.jar *.xml The command runs fine, but when I search no results are displayed. I think it is an encoding problem. Can anyone help? -- View this

Re: Problems while searching in default field

2011-12-28 Thread mechravi25
Hi, Thanks a lot guys. I tried the following options 1.) Downloaded the solr 3.5.0 version and updated the schema.xml file with the sample fields i have. I then tried to set the property ignoreCaseForWildcards=true for a field type as mentioned in the url given for the patch-2438, but got the

Re: Migration from Solr 1.4 to Solr 3.5

2011-12-28 Thread Bhavnik Gajjar
Thanks community! That helps! To check practically, I have now setup Solr 3.5 in test environment. Few observations on that, 1. I simply copy-pasted one of the Solr 1.4 instance on Solr 3.5 setup (after correcting schema.config and solr.config files based on what is suited for 3.5). If

Re: Indexing problem

2011-12-28 Thread Ahmet Arslan
http://lucene.472066.n3.nabble.com/file/n3616191/02.xml 02.xml I am trying to index this file. For this I am using this command: java -jar post.jar *.xml The command runs fine, but when I search no results are displayed. I think it is an encoding problem. Can anyone help?

Re: Indexing problem

2011-12-28 Thread Martin Koch
Could it be that you need a commit? curl 'localhost:8983/solr/update?commit=true' /Martin On Wed, Dec 28, 2011 at 11:47 AM, mumairshamsi mumairsha...@gmail.com wrote: http://lucene.472066.n3.nabble.com/file/n3616191/02.xml 02.xml i am trying to index this file for this i am using
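
The same commit request, sketched in Python for anyone scripting the indexing step. The base URL mirrors the curl command above; the function names are hypothetical, and commit() only does something useful when a Solr instance is actually listening at that address.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

def commit_url(base: str = "http://localhost:8983/solr") -> str:
    """Build the update URL that asks Solr to commit pending documents."""
    return f"{base}/update?{urlencode({'commit': 'true'})}"

def commit(base: str = "http://localhost:8983/solr") -> int:
    """Send the commit request and return the HTTP status code."""
    with urlopen(commit_url(base), timeout=10) as response:
        return response.status

print(commit_url())
```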

Re: best practice to introducing singletons inside of Solr (IoC)

2011-12-28 Thread Erick Erickson
I must be missing something here. Why would this be any different from any other singleton? I just did a little experiment where I implemented the classic singleton pattern in a RequestHandler and accessed from a Filter (both plugins) with no problem at all, just the usual blah var =

Re: How can I check if a more complex query condition matched?

2011-12-28 Thread Erick Erickson
There's no easy/efficient way that I know of to do this. Perhaps a good question is what value-add this is going to make for your app and is there a better way to convey this information. For instance, would highlighting convey enough information to your user? You're right that you don't want to

Re: solr keep old docs

2011-12-28 Thread Erick Erickson
Well, the short answer is that nobody else has 1) had a similar requirement AND 2) not found a suitable work around AND 3) implemented the change and contributed it back. So, if you'd like to volunteer <G>. Seriously. If you think this would be valuable and are willing to work on it, hop on over

Re: Problems while searching in default field

2011-12-28 Thread Erick Erickson
Right, you were misled by the discussion for that patch; the option you specified was NOT how the patch was eventually implemented. Try reading this page instead: http://wiki.apache.org/solr/MultitermQueryAnalysis The short form is that with 3.6 (i.e. 3.x at this point) you may not have to do

Re: solr keep old docs

2011-12-28 Thread Alexander Aristov
Thanks Erick, that gives me a direction. I will write a new plugin and get back to the dev forum with results, and then we will decide on next steps. Best Regards Alexander Aristov On 28 December 2011 18:08, Erick Erickson erickerick...@gmail.com wrote: Well, the short answer is that nobody

Re: Poor performance on distributed search

2011-12-28 Thread Yonik Seeley
On Wed, Dec 28, 2011 at 5:47 AM, ku3ia dem...@gmail.com wrote: So, based on p.2) and on my previous research, I conclude that the more documents I want to retrieve, the slower the search is, and the main problem is the cycle in the writeDocs method. Am I right? Can you advise something in this

Re: solr keep old docs

2011-12-28 Thread Tanguy Moal
Hello Alexander, I don't know much about your requirements in terms of size and performances, but I've had a similar use case and found a pretty simple workaround. If your duplicate rate is not too high, you can have the SignatureProcessor to generate fingerprint of documents (you already did

High response time after being idle

2011-12-28 Thread Odey
Hello, I'm running Solr 3.5 on a XAMPP/Tomcat environment. It's working pretty well, with just one exception: when Solr remains idle without handling any requests for about 5-10 mins, the first request sent afterwards is delayed by a few seconds. Subsequent requests are lightning-fast as usual. So

Re: High response time after being idle

2011-12-28 Thread Gora Mohanty
On Wed, Dec 28, 2011 at 8:52 PM, Odey mariofi...@googlemail.com wrote: Hello, I'm running Solr 3.5 on a XAMPP/Tomcat environment. It's working pretty good for just one exception: when Solr remains idle without handling any requests for about 5-10 mins the first request sent again will be

Re: solr keep old docs

2011-12-28 Thread Chris Hostetter
: That said, writing your own update request handler : that detected this case isn't very difficult, : extend UpdateRequestProcessorFactory/UpdateRequestProcessor : and use it as a plugin. i can't find the thread at the moment, but the general issue that has caused people headaches with this

Re: LineEntityProcessor

2011-12-28 Thread Chris Hostetter
You really haven't posted enough details for people to guess as to what your problem might be (in particular: the actual examples of your configs, and any log messages during the import). Please consult this wiki page and then post a followup with more details...

Re: XPathEntityProcessor and ExtractingRequestHandler

2011-12-28 Thread Chris Hostetter
: Can I use a XPathEntityProcessor in conjunction with an : ExtractingRequestHandler? Also, the scripting language that : XPathEntityProcessor uses/supports, is that just ECMA/JavaScript? : : Or is XPathEntityProcessor only supported for use in conjunction with the : DataImportHandler? The

Re: Solr-3.5.0/Nutch-1.4 - SolrDeleteDuplicates fails

2011-12-28 Thread Chris Hostetter
: Exception in thread main java.io.IOException: Job failed! : : at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1252) : : at : org.apache.nutch.indexer.solr.SolrDeleteDuplicates.dedup(SolrDeleteDuplicates.java:373) : : at :

Re: FTP mount crash when crawling with solrj

2011-12-28 Thread Chris Hostetter
: I have a lots of files in my FTP account,and i use the curlftpfs to mount : them to folder and then start index them with solrj api, but after a minutes : pass something strange happen and the mounted folder is not accessible and : crash,also i can not unmount it and the message device is in

Re: edismax doesn't obey 'pf' parameter

2011-12-28 Thread Chris Hostetter
: Of course. What I meant to say was there is : always exactly one token in a non-tokenized : field and its offset is always exactly 0. There : will never be tokens at position 1. : : So asking to match phrases, which is based on : term positions is basically a no-op. That's not always true.

Facet Ordering

2011-12-28 Thread Jamie Johnson
I've seen in the solr faceting overview that it is possible to sort either by count or lexicographically, but is there a way to sort so the lowest counts come back first?

Re: Grouping results after Sorting or vice-versa

2011-12-28 Thread Juan Grande
Hi, I don't have an answer, but maybe I can help you if you provide more information, for example: - Which Solr version are you running? - Which is the type of the date field? - The output you are getting - The output you expect - Any other information that you consider relevant. Thanks,

Re: High response time after being idle

2011-12-28 Thread Erick Erickson
What else, if anything, do you have running on the server? Because it's possible that pages are being swapped out for other processes to use. Solr itself shouldn't, as far as I know, time out anything so I expect you're running into issues with the op system. Best Erick On Wed, Dec 28, 2011 at

Re: Custom Solr FunctionQuery Error

2011-12-28 Thread Juan Grande
Hi Parvin, You must also add the query parser definition to solrconfig.xml, for example: <queryParser name="graph" class="org.gasimzade.solr.GraphQParserPlugin"/> *Juan* On Wed, Dec 28, 2011 at 4:16 AM, Parvin Gasimzade parvin.gasimz...@gmail.com wrote: Hi all, I have created custom Solr

Re: Custom Solr FunctionQuery Error

2011-12-28 Thread Yonik Seeley
On Wed, Dec 28, 2011 at 2:16 AM, Parvin Gasimzade parvin.gasimz...@gmail.com wrote: I have created custom Solr FunctionQuery in Solr 3.4. I extended ValueSourceParser, ValueSource, Query and QParserPlugin classes. Note that you only need a QParserPlugin implementation for top level query types,

Re: High response time after being idle

2011-12-28 Thread Otis Gospodnetic
Right, I think that's what's happening here. Google swappiness if you are on Linux. Alternatively, one could add something to prevent the OS from swapping out Solr's process. Here is how ElasticSearch does it, for example: https://github.com/elasticsearch/elasticsearch/issues/464 Otis

Re: High response time after being idle

2011-12-28 Thread Chris Hostetter
: Is it possible that the system is running out of RAM, and swapping, : or is aggressively swapping for some reason? it doesn't have to be the solr /tomcat process memory getting swapped out -- but that's certainly possible -- it could also be that the filesystem cache is expunging the disk

Re: Facet Ordering

2011-12-28 Thread Koji Sekiguchi
(11/12/29 5:50), Jamie Johnson wrote: I've seen in the solr faceting overview that it is possible to sort either by count or lexicographically, but is there a way to sort so the lowest counts come back first? As far as I know, no. What is your use case? koji -- http://www.rondhuit.com/en/
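
Since the reply above knows of no server-side option, one workaround is to re-sort on the client. The Python sketch below assumes the facet values arrive as Solr's flat [term, count, term, count, ...] list and that all values were requested (for example with facet.limit=-1), which may be expensive on fields with many distinct terms. The sample data is invented.

```python
def facets_ascending(flat):
    """Turn a flat [term, count, ...] list into pairs sorted by lowest count first."""
    pairs = list(zip(flat[0::2], flat[1::2]))
    # Ties on count are broken alphabetically so the order is stable.
    return sorted(pairs, key=lambda pair: (pair[1], pair[0]))

print(facets_ascending(["paris", 12, "tokyo", 3, "lima", 7]))
```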

Re: Facet Ordering

2011-12-28 Thread Jamie Johnson
I have a database where a user is searching for documents, and the things which I'm faceting on are tags. Tags boil down to things of interest, perhaps names, places, etc. The user in our case has asked for the ability to change the ordering so they can easily find things that appear very

Re: Facet Ordering

2011-12-28 Thread Chris Hostetter
: I've seen in the solr faceting overview that it is possible to sort : either by count or lexicographically, but is there a way to sort so : the lowest counts come back first? Peter Sturge looked into this a while back and provided a patch, but there were some issues with it that never got

Re: Sort facets by defined custom Collator

2011-12-28 Thread Chris Hostetter
: Subject: Sort facets by defined custom Collator deja-vu... http://www.lucidimagination.com/search/p:solr/s:email/l:user/sort:date?q=%22Facet+Ordering%22 -Hoss

Re: Custom Shingle Factory Filter Requirement

2011-12-28 Thread Vannia Rajan
On Tue, Dec 27, 2011 at 1:10 PM, Ahmet Arslan iori...@yahoo.com wrote: To achieve this behavior, you can use StandardTokenizerFactory and EdgeNGramFilterFactory and LowerCaseFilterFactory at index time. http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.EdgeNGramFilterFactory
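
To make the suggested chain concrete, here is a rough Python imitation of what tokenizing, lowercasing and edge n-gramming would produce at index time. It only approximates the Solr filters; the simple word regex and the min_gram/max_gram defaults are assumptions.

```python
import re

def edge_ngrams(text: str, min_gram: int = 1, max_gram: int = 5):
    """Return the leading-edge prefixes of every lowercased token."""
    grams = []
    for token in re.findall(r"\w+", text.lower()):
        # Tokens shorter than min_gram produce no grams at all.
        for size in range(min_gram, min(max_gram, len(token)) + 1):
            grams.append(token[:size])
    return grams

print(edge_ngrams("Solr Rocks", min_gram=2, max_gram=3))
```

Indexing these prefixes is what lets a partially typed word match without a wildcard query.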

Re: Migration from Solr 1.4 to Solr 3.5

2011-12-28 Thread Lance Norskog
Yes, the 3.5 Solr is opening and reading the Solr 1.4 index. When you do a commit, it will rewrite the index in 3.5 format. Doing a complete copy of the configs from 1.4 to 3.5 is easy, but there are a lot of new features and changed defaults in the solrconfig.xml file. These make indexing

Re: Solr Distributed Search vs Hadoop

2011-12-28 Thread Lance Norskog
Here is an example of schema design: a PDF file of 5MB might have maybe 50k of actual text. The Solr ExtractingRequestHandler will find that text and only index that. If you set the field to stored=true, the 5MB will be saved. If stored=false, the PDF is not saved. Instead, you would store a link

Re: Solr Distributed Search vs Hadoop

2011-12-28 Thread Ted Dunning
This copying is a bit overstated here because of the way that small segments are merged into larger segments. Those larger segments are then copied much less often than the smaller ones. While you can wind up with lots of copying in certain extreme cases, it is quite rare. In particular, if you

Re: solr keep old docs

2011-12-28 Thread Alexander Aristov
Unfortunately I have a lot of duplicates, and given that searching might suffer, I will try implementing an update processor. But your idea is interesting and I will consider it, thanks. Best Regards Alexander Aristov On 28 December 2011 19:12, Tanguy Moal tanguy.m...@gmail.com wrote: Hello

Re: solr keep old docs

2011-12-28 Thread Alexander Aristov
Yes, I have been warned that querying the index each time before adding a doc might be resource-consuming. Will check it. As for the overwrite parameter I think the name is not the best then. People outside the business like me misuse it and assume what I wrote. Overwrite shall mean what it means.

Re: High response time after being idle

2011-12-28 Thread Odey
It seems like my operating system was causing me trouble in some way. I couldn't find what was triggering this issue, but after migrating the whole project from wamp to lamp it has been resolved and everything is running smoothly again. Thank you very much for your help! Regards, -- View this

Re: solr keep old docs

2011-12-28 Thread Mikhail Khludnev
Alexander, I have two ideas how to implement fast dedupe externally, assuming your PKs don't fit to java.util.*Map: - your crawler can use inprocess RDBMS (Derby, H2) to track dupes; - if your crawler is stateless - it doesn't track PKs which has been already crawled, you can retrieve
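
The first idea can be sketched with an embedded database that remembers which primary keys the crawler has already sent. The Python example below uses SQLite from the standard library as a stand-in for Derby or H2; the class and table names are hypothetical.

```python
import sqlite3

class SeenKeys:
    """Tracks primary keys on disk (or in memory) so repeats can be skipped."""

    def __init__(self, path: str = ":memory:"):
        self.db = sqlite3.connect(path)
        self.db.execute("CREATE TABLE IF NOT EXISTS seen (pk TEXT PRIMARY KEY)")

    def first_time(self, pk: str) -> bool:
        """Record pk and return True only if it had not been seen before."""
        cursor = self.db.execute(
            "INSERT OR IGNORE INTO seen (pk) VALUES (?)", (pk,)
        )
        self.db.commit()
        return cursor.rowcount == 1

seen = SeenKeys()
for key in ["doc-1", "doc-2", "doc-1"]:
    if seen.first_time(key):
        print("index", key)   # new document: send it to Solr
    else:
        print("skip", key)    # already indexed: keep the old document
```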

Re: Grouping results after Sorting or vice-versa

2011-12-28 Thread Vijayaragavan
Hi Juan, I'm using Solr 3.1. The type of the date field is long. Let's say the documents indexed in the Solr server are: <doc> <str name="uniqueid">1326c5cc09bbc99a_1</str> <str name="threadid">1326c5cc09bbc99a</str> <long name="date">1316078009000</long> .. Some Other fields here .. <str name="subject">Some

Re: best practice to introducing singletons inside of Solr (IoC)

2011-12-28 Thread Mikhail Khludnev
Erick, Ok. Let me try with a plain Java one. Possibly I'll need tighter integration, like injecting a core into the singleton, etc. But I don't know yet. Thanks for your efforts. On Wed, Dec 28, 2011 at 5:48 PM, Erick Erickson erickerick...@gmail.com wrote: I must be missing something here.

Re: Custom Solr FunctionQuery Error

2011-12-28 Thread Parvin Gasimzade
Thank you for your answers. I have a Map<docId, score> and want to boost the score of those documents at search time. In my example I get that map inside the ValueSource and boost the matched documents' scores. If {!graph} is added to the query, it will return the boosted query; otherwise it will