Re: Multi-words synonyms matching

2012-05-31 Thread Bernd Fehling
Are you sure about LUCENE_33 (use of BitVector)? On 31.05.2012 17:20, O. Klein wrote: > I have been struggling with this as well and found that using LUCENE_33 gives > the best results. > > But as it will be deprecated, this is no everlasting solution. Maybe somebody > knows one? >

Re: How can I remove the home page priority of site home page from search results

2012-05-31 Thread Jack Krupansky
Add &debugQuery=true to your query and check how the home page is scored. That should give you a clue why the title is not boosting the score enough. Maybe you simply need a higher boost for title, but let the debugQuery scoring be your guide. Actually, if you are explicitly referencing a fiel
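A minimal sketch of the suggestion above, assuming a Solr instance at the hypothetical address http://localhost:8983/solr with the standard /select handler; the query title:home is made up for illustration. It only builds the request URL with debugQuery=true appended, so the scoring explanation for the home page can be read from the response:

```python
from urllib.parse import urlencode


def build_debug_url(base_url, query):
    """Return a /select URL that asks Solr for score explanations."""
    # debugQuery=true is the parameter named in the reply above.
    params = {"q": query, "debugQuery": "true"}
    return base_url.rstrip("/") + "/select?" + urlencode(params)


# Hypothetical host and query, used only for illustration.
url = build_debug_url("http://localhost:8983/solr", "title:home")
print(url)
```

Open the printed URL in a browser and look at the debug section of the response to see how each field contributes to the score.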

Re: Stop Words in SpellCheckComponent

2012-05-31 Thread Jack Krupansky
Your earlier email had this option in your spellcheck.de field type analyzer for the StopFilterFactory: words="german_stop_long.txt" But your most recent email referred to "stopword.txt". So, either add "the" to german_stop_long.txt, or change the "words" option of your stopfilter to refer to

Re: Stop Words in SpellCheckComponent

2012-05-31 Thread Matthias Müller
> spellcheck_de > > That should reference a field, not a field type. Thanks for your help. But I did that, too. Here I'll show that even the solr example webapp makes suggestions for stopwords: I've ... 1. added "the" to the stopwords.txt 2. added "thex" to an example document (field name) 3. st

Re: index special characters solr

2012-05-31 Thread Jack Krupansky
Special characters are filtered out of (most) "text" fields, but are preserved in "string" fields. String fields might suit your needs, but are inconvenient for keyword searching. You may be able to use the "types" option of the WordDelimiterFilterFactory to pass in a custom character type tab

Challenge: Is dynamic data source possible for DataImportHandler JdbcDataSource?

2012-05-31 Thread Cheng Zhang
Hi, The challenge I'm facing is some sort of dynamic data source. Your valuable input is highly appreciated. Below is my data-config.xml. I have one user database and two company databases. The user table in the user database has four columns which are id + name + company_dbname + company_id.

index special characters solr

2012-05-31 Thread KPK
Hi all, can somebody please tell me how I can build an index in Solr where one of my fields contains special characters like $ and %? I would also like to search on the same characters in that particular field. Any advice would be appreciated. Thanks

Re: Fwd: Data Import Handler fields with different values in column and name

2012-05-31 Thread Rafael Taboada
Hi Jack, Thanks for your help. I delete conf/data/* on every restart to make sure I work with clean data. Is there any other config I should change? Maybe another XML file. Kind regards On Thu, May 31, 2012 at 5:18 PM, Jack Krupansky wrote: > It looks okay; renaming a column is fine. > > Maybe...

Re: Fwd: Data Import Handler fields with different values in column and name

2012-05-31 Thread Jack Krupansky
It looks okay; renaming a column is fine. Maybe... maybe when you re-run it DIH is not replacing any documents that already have id's in Solr, leaving them with their old field values. Maybe you need to manually delete the old Solr documents and run a fresh full import. -- Jack Krupansky --

Re: Solr with UIMA

2012-05-31 Thread Jack Krupansky
Is it failing on the first document? I see "uid 5", suggests that it is not. If not, how is this document different from the others? I see the exception org.apache.uima.resource.ResourceInitializationException, suggesting that some file cannot be loaded. It sounds like it may be having troubl

Re: Strip html

2012-05-31 Thread Chris Hostetter
: I make a transformation XSLT which return : : --- : si les ruches d’abeilles prouvent la : monarchie, les fourmillières, les troupes d’éléphants ou : de castors prouvent la république. : --- : i put this ht

Re: possible status codes from solr during a (DIH) data import process

2012-05-31 Thread jmlucjav
There is at least one scenario where no error is reported when it should be: if the host runs out of disk while optimizing, the failure is not reported. There is a JIRA issue open, I think.

Fwd: Data Import Handler fields with different values in column and name

2012-05-31 Thread Rafael Taboada
Please, Can anyone guide me through this issue? Thanks -- Forwarded message -- From: Rafael Taboada Date: Thu, May 31, 2012 at 12:30 PM Subject: Data Import Handler fields with different values in column and name To: solr-user@lucene.apache.org Hi folks, I'm using Solr 3.6 a

Re: possible status codes from solr during a (DIH) data import process

2012-05-31 Thread Rahul Warawdekar
Hi, That's correct. For failure, you have to check for the text *"Indexing failed. Rolled back changes"* under the tag. One more thing to note here is that there may be a time during the indexing process where the indexing is complete but the index is not committed and optimized yet. You would nee

RE: possible status codes from solr during a (DIH) data import process

2012-05-31 Thread Dyer, James
You've got it right. Here's a summary: - "status" = "busy" means it's in process. - "status" = "idle" means it's finished (success or failure). - You can drill down further by looking at sub-elements under "statusMessages" : > if there is , it means the last import was cancelled > with "comma
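The checks described in this thread can be sketched roughly as follows. This is an illustration, not the DIH's documented contract: the sample XML is a hand-written, simplified guess at a status response, and classify_import is a hypothetical helper name. Only the "busy"/"idle" values and the "Indexing failed. Rolled back changes" text come from the replies in this thread.

```python
import xml.etree.ElementTree as ET

# Hand-written, simplified sample; a real response carries more elements.
SAMPLE_FAILED = """<response>
  <str name="status">idle</str>
  <lst name="statusMessages">
    <str name="">Indexing failed. Rolled back changes.</str>
  </lst>
</response>"""

SAMPLE_BUSY = """<response>
  <str name="status">busy</str>
  <lst name="statusMessages"/>
</response>"""


def classify_import(xml_text):
    """Map a status response to 'running', 'failed', 'finished' or 'unknown'."""
    root = ET.fromstring(xml_text)

    # Top-level <str name="status"> holds "busy" or "idle".
    status = None
    for el in root.findall("str"):
        if el.get("name") == "status":
            status = (el.text or "").strip()
    if status == "busy":
        return "running"

    # Collect every message value under <lst name="statusMessages">.
    messages = []
    for lst in root.findall("lst"):
        if lst.get("name") == "statusMessages":
            messages = [child.text or "" for child in lst]

    if any("Indexing failed. Rolled back changes" in m for m in messages):
        return "failed"
    # "idle" without the failure text is treated as finished here.
    return "finished" if status == "idle" else "unknown"


print(classify_import(SAMPLE_FAILED))
print(classify_import(SAMPLE_BUSY))
```

A polling script would fetch the status response at intervals and call the helper until it stops returning 'running'. As noted elsewhere in the thread, commit and optimize may still be pending right after indexing completes.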

Re: index merge

2012-05-31 Thread sudarshan
Hi All, I have a basic question about index merging in Solr. The setup that I have followed is as follows: Setup: I used the schema.xml that comes with the Solr example. I had three cores - core0, core1 and core2. I tried merging the indexes of core0 and core1 into core2. I copied the same

Re: Cannot get highlighting to work

2012-05-31 Thread Jack Krupansky
Try a query that uses a term that doesn't split an alphanumeric term into two terms. Then check to see what field type you used for the symbol and marker_symbol fields and whether the analyzer for that field type has changed in 3.6. -- Jack Krupansky -Original Message- From: Asfan

Re: Using Data Import Handler to invoke a stored procedure with output (cursor) parameter

2012-05-31 Thread Lance Norskog
Can you add a new stored procedure that uses your current one? It would operate like the DIH expects. I don't remember if DB cursors are a standard part of JDBC. If they are, it would be a great addition to the DIH if they work right. On Thu, May 31, 2012 at 10:44 AM, Niran Fajemisin wrote: > Th

Re: Merging Remote Solr Indexes?

2012-05-31 Thread Lance Norskog
Merging indexes is not really useful- it won't make distributed search any faster. There are features that don't work with distributed search. Really, you are better off having shards with enough documents so that relevance scoring is balanced. On Thu, May 31, 2012 at 11:04 AM, sudarshan wrote: >

Re: Is optimize needed on slaves if it replicates from optimized master?

2012-05-31 Thread Walter Underwood
http://wiki.apache.org/solr/SolrPerformanceFactors#mergeFactor The defaults are very good. I have never changed them, and I've had Solr in production at two major sites, Netflix and Chegg. Don't spend any more time worrying about merges. wunder On May 31, 2012, at 10:51 AM, sudarshan wrote: >

possible status codes from solr during a (DIH) data import process

2012-05-31 Thread geeky2
hello all, i have been asked to write a small polling script (bash) to periodically check the status of an import on our Master. our import times are small, but there are business reasons why we want to know the status of an import after a specified amount of time. i need to perform certain acti

Re: Stop Words in SpellCheckComponent

2012-05-31 Thread Jack Krupansky
Spellcheck wants a field, not a field type. You have a spellcheck_de field type, but you need a field as well. spellcheck_de That should reference a field, not a field type. -- Jack Krupansky -Original Message- From: Matthias Müller Sent: Thursday, May 31, 2012 3:23 PM To: solr-user

Fwd: Strip html

2012-05-31 Thread Michael Della Bitta
If I'm not mistaken, that's TEI, and I suggest you consult with the TEI community for strategies for document indexing, as there are a lot of branching-style tags in TEI. My guess is that you'll hear that it's best to perform some sort of term expansion on the document as a preprocessing step. Mic

Re: Stop Words in SpellCheckComponent

2012-05-31 Thread Matthias Müller
>> is it possible to configure a stopword list to the SpellCheckComponent? > Add a stopwordfilter to your spellcheck field. Hmm, I did. Could it be another mistake? This is the schema definition: This is the solrconfig:

Re: Strip html

2012-05-31 Thread Jack Krupansky
There is no option in the Strip HTML filter to discard whitespace between elements. And it certainly doesn't know the semantics of some XML schema for "choice". You'll have to pre-process that semantics before Solr ingestion, or do your own custom filter. -- Jack Krupansky -Original Messa

Re: Data Import Handler fields with different values in column and name

2012-05-31 Thread Rafael Taboada
Jack, Thanks for your help. I restarted Solr every time I changed schema.xml. Every doc about this says it is possible to map the column to a different name value, but I can't. Thanks again. Rafael On Thu, May 31, 2012 at 1:27 PM, Jack Krupansky wrote: > Is there any chance that you a

Re: Data Import Handler fields with different values in column and name

2012-05-31 Thread Jack Krupansky
Is there any chance that you added the "anotherasunto" field and then forgot to shut down and reload Solr? Any time you edit schema.xml or solrconfig.xml you need to reload Solr for the changes to take effect. -- Jack Krupansky -Original Message- From: Rafael Taboada Sent: Thursday,

Re: Merging Remote Solr Indexes?

2012-05-31 Thread sudarshan
Hi All, I'm new to Solr. I saw this post relating to merging of indexes. I have a similar question. From the post, I understand that merging of indexes across different cores is possible only if the cores exist on a single machine. I want to merge indexes of different machines. Can you pleas

Re: Is optimize needed on slaves if it replicates from optimized master?

2012-05-31 Thread sudarshan
Walter, Thanks again. Can you specify the criteria based on which Solr optimizes/force-merges segments automatically? Is this defined by the mergeFactor parameter - like if the mergeFactor is 10, then a merge happens for every 10 segments? Please explain. Thanks, Sudarshan

RE: Stop Words in SpellCheckComponent

2012-05-31 Thread Markus Jelsma
Add a stopwordfilter to your spellcheck field. -Original message- > From:Matthias Müller > Sent: Thu 31-May-2012 18:39 > To: solr-user@lucene.apache.org > Subject: Stop Words in SpellCheckComponent > > Hi, > > is it possible to configure a stopword list to the SpellCheckComponent? > >

Re: Solr with UIMA

2012-05-31 Thread debdoot
Further observation on the error: All requests to add documents through the /update URL land up with the same error, irrespective of the fields contained in the document. If I don't use the UIMAUpdateRequestProcessor, I can add/update documents successfully. Here are the snippets relevant to upda

Re: Using Data Import Handler to invoke a stored procedure with output (cursor) parameter

2012-05-31 Thread Niran Fajemisin
Thanks for your response, Michael. Unfortunately changing the stored procedure is not really an option here.  From what I'm seeing, it would appear that there's really no way of somehow instructing the Data Import Handler to get a handle on the output parameter from the stored procedure. It's a

Data Import Handler fields with different values in column and name

2012-05-31 Thread Rafael Taboada
Hi folks, I'm using Solr 3.6 and I'm trying to import data from my database to Solr using Data Import Handler. My db-config is like this: My problem is when I'm trying to use different values in the field tag, for example Wh

Strip html

2012-05-31 Thread Tigunn
Hello, I have a full-text index on XML files. Example: --- si les ruches d’abeilles > prouvent la > monarchie, les fourmillières, les troupes d

Re: Solr with UIMA

2012-05-31 Thread debdoot
Hi Tommaso, I have followed the steps you have listed to try to deploy the example RoomNumberAnnotator with Solr 3.5. Here is the error trace that I get: org.apache.solr.common.SolrException: processing error: null. uid=5, text="Test Room HAW GN-K35..." at org.apache.solr.uima.processor

Stop Words in SpellCheckComponent

2012-05-31 Thread Matthias Müller
Hi, is it possible to configure a stopword list to the SpellCheckComponent? For example: When searching for "the indexs" "the" is filtered, because it is a stopword. The SpellCheckComponent gives me a false suggestion for "the". But the SpellCheckComponent should only give a suggestion for "index

Cannot get highlighting to work

2012-05-31 Thread Asfand Qazi
Hello, I am having problems getting highlighting to work on a Solr 3.6 instance, while it was working just fine before on our 1.4 instance. The solrconfig.xml and schema.xml files are located here: https://github.com/mpi2/mpi2_solr/blob/master/multicore/main/conf/schema.xml (please note the incorrect lin

Re: per-fieldtype similarity not working

2012-05-31 Thread Robert Muir
On Thu, May 31, 2012 at 11:23 AM, Markus Jelsma wrote: > We simply declare the following in our fieldType: > > That's not enough, see the example: http://svn.apache.org/repos/asf/lucene/dev/trunk/solr/core/src/test-files/solr/conf/schema-sim.xml -- lucidimagination.com

per-fieldtype similarity not working

2012-05-31 Thread Markus Jelsma
Hi, We intend to use different similarity implementations for some field types configured according to SOLR-2338. I double-checked with the schema in test-files and everything seems fine. However, the result is not correct and debugQuery shows the default configured similarity implementation is

Re: Multi-words synonyms matching

2012-05-31 Thread O. Klein
I have been struggling with this as well and found that using LUCENE_33 gives the best results. But as it will be deprecated, this is no everlasting solution. Maybe somebody knows one?

Re: Accent Characters

2012-05-31 Thread Vicente Couto
Hello, guys. Now it's working. Thank you both Jack and Sami. I fixed my issue by just using server.query(query, METHOD.POST) in solrJ and yes, I was using HttpSolrServer. I have to move on to CommonsHttpSolrServer. Thank you very much. -- View this message in context: http://lucene.472066.n3.na

Re: Using Data Import Handler to invoke a stored procedure with output (cursor) parameter

2012-05-31 Thread Michael Della Bitta
I could be wrong about this, but Oracle has a table() function that I believe turns the output of a function into a table. So possibly you could wrap your procedure in a function that returns the cursor, or convert the procedure to a function. Michael Della Bitta

RE: spellcheck collate with fq parameters SOLR-2010

2012-05-31 Thread Markus Jelsma
Thanks James, that works nicely! -Original message- > From:Dyer, James > Sent: Thu 31-May-2012 16:05 > To: solr-user@lucene.apache.org > Subject: RE: spellcheck collate with fq parameters SOLR-2010 > > Markus, > > When you set "spellcheck.maxCollationTries" to a value greater than ze

RE: spellcheck collate with fq parameters SOLR-2010

2012-05-31 Thread Dyer, James
Markus, When you set "spellcheck.maxCollationTries" to a value greater than zero, the spellchecker will query each collation candidate to determine how many hits it would return. If the collation will not yield any hits, it throws it away then tries some more (up to whatever value you set). Y
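As a rough illustration of the setting described above, the sketch below assembles a query string that turns on collation and sets spellcheck.maxCollationTries alongside a filter query. The query term, the filter value category:books, and the value 5 are invented for the example; the helper name build_collation_query is hypothetical.

```python
from urllib.parse import urlencode


def build_collation_query(query, filters, max_tries=5):
    """Build request parameters that let the spellchecker test collations."""
    params = [
        ("q", query),
        ("spellcheck", "true"),
        ("spellcheck.collate", "true"),
        # A value greater than zero makes the spellchecker query each
        # collation candidate, as described in the reply above.
        ("spellcheck.maxCollationTries", str(max_tries)),
    ]
    # Filter queries may repeat, so a list of pairs is used instead of a dict.
    params += [("fq", f) for f in filters]
    return urlencode(params)


# Hypothetical misspelled query and filter, used only for illustration.
qs = build_collation_query("indexs", ["category:books"])
print(qs)
```

The resulting string can be appended to the request handler URL in use.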

Re: XInclude Multiple Elements

2012-05-31 Thread Bogdan Nicolau
I've also tried a lot of tricks to get xpointer working with multiple child elements, to no success. In the end, I've resorted to a less pretty, other-way-around solution. I do something like this: solrconfig_common.xml -> no xml declaration, no root tag, no nothing ... For each file that I need

Re: Hightlighting and excerpt

2012-05-31 Thread Ahmet Arslan
> I need something like http://cl.ly/2o2E0g0S422d2p1X203h . See how TCMB > was stressed? Hi Tolga, I think you can easily learn the basics using one of the following books. http://lucene.apache.org/solr/books.html

Re: Hightlighting and excerpt

2012-05-31 Thread Jack Krupansky
The Solr example. As in the Solr tutorial. See: http://lucene.apache.org/solr/api/doc-files/tutorial.html Index books.json from exampledocs and then enter a /browse request in your web browser. Add the "&wt=xml" query parameter so that you can see the raw XML response that shows the "highlight

spellcheck collate with fq parameters SOLR-2010

2012-05-31 Thread Markus Jelsma
Hi, It seems it doesn't work or I cannot get it to work. I've tried both the IndexSpellchecker in Solr 3.2 and the DirectSpellchecker of trunk. The correctly spelled flag is correct when considering the fq parameters but the collation never is when using a filter. I've also tried spellcheck.ma

Re: Hightlighting and excerpt

2012-05-31 Thread Tolga
You mean http://www.example.com:8983/solr/browse? It says "unknown field 'cat'" On 5/31/12 4:16 PM, Jack Krupansky wrote: Yes, that is what highlighting does - it extracts an excerpt and highlights search terms. You said you have highlighting working, so what else is it that you need? Try "

Re: Hightlighting and excerpt

2012-05-31 Thread Jack Krupansky
Yes, that is what highlighting does - it extracts an excerpt and highlights search terms. You said you have highlighting working, so what else is it that you need? Try "/browse" in the Solr example. It does exactly what your example shows. So, what else is it that you are trying to do? Or if s

Efficiently mining or parsing data out of XML source files

2012-05-31 Thread Van Tassell, Kristian
I'm just wondering what the general consensus is on indexing XML data to Solr in terms of parsing and mining the relevant data out of the file and putting them into Solr fields. Assume that this is the XML file and resulting Solr fields: XML data: foo garbage data Solr Fields: Id=1234 Title
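One possible approach, sketched with the Python standard library. The XML markup in the original message did not survive, so the element and attribute names below (record, id, title, misc) are purely hypothetical stand-ins chosen to match the values shown (1234, foo, garbage data). The idea is to pick out only the wanted elements and ignore the rest before sending fields to Solr.

```python
import xml.etree.ElementTree as ET

# Hypothetical source document; the real markup was lost from the message.
SAMPLE_XML = (
    '<record id="1234"><title>foo</title><misc>garbage data</misc></record>'
)

# Hypothetical mapping of Solr field name -> child element to read.
WANTED = {"title": "title"}


def extract_fields(xml_text):
    """Return a dict of Solr fields mined from one XML record."""
    root = ET.fromstring(xml_text)
    fields = {"id": root.get("id")}
    for solr_field, tag in WANTED.items():
        el = root.find(tag)
        # Elements that are missing or empty are skipped, not indexed.
        if el is not None and el.text:
            fields[solr_field] = el.text.strip()
    return fields


print(extract_fields(SAMPLE_XML))
```

Doing the extraction in a preprocessing step like this keeps unwanted data out of the index; whether that beats an in-Solr approach is the open question of the thread.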

Re: Hightlighting and excerpt

2012-05-31 Thread Tolga
I need something like http://cl.ly/2o2E0g0S422d2p1X203h . See how TCMB was stressed? On 5/31/12 3:54 PM, Jack Krupansky wrote: Since highlighting, by definition, does highlight terms in "excerpts" (snippets or fragments from a text field), what else is it that you need? -- Jack Krupansky ---

Re: Hightlighting and excerpt

2012-05-31 Thread Jack Krupansky
Since highlighting, by definition, does highlight terms in "excerpts" (snippets or fragments from a text field), what else is it that you need? -- Jack Krupansky -Original Message- From: Tolga Sent: Thursday, May 31, 2012 4:55 AM To: solr-user@lucene.apache.org Subject: Hightlighting

Re: Query elevation / boosting or something else to guarantee document position

2012-05-31 Thread Michael Kuhlmann
Hi Wenca, I'm a bit late, but maybe you're still interested. There's no such functionality in standard Solr. With sorting, this is not possible, because sort functions only rank each single document, they know nothing about the position of the others. And query elevation is similar, you'll ra

Using Data Import Handler to invoke a stored procedure with output (cursor) parameter

2012-05-31 Thread Niran Fajemisin
Hi all, I've seen a few questions asked around invoking stored procedures from within Data Import Handler but none of them seem to indicate what type of output parameters were being used. I have a stored procedure created in Oracle database that takes a couple input parameters and has an outpu

Re: difference between Katta and SolrCloud (replicator factor)

2012-05-31 Thread Jamel ESSOUSSI
Hi, responses please -- Jamel E

Re: Creating custom Filter / Tokenizer / Request Handler for integration of NER-Framework

2012-05-31 Thread Wunderlich, Tobias
Thanks for all the responses. I went with the UpdateRequestProcessor and it works. -Original Message- From: Lance Norskog [mailto:goks...@gmail.com] Sent: Saturday, 26 May 2012 01:53 To: solr-user@lucene.apache.org Subject: Re: Creating custom Filter / Tokenizer / Request Hand

Hightlighting and excerpt

2012-05-31 Thread Tolga
Hi, Two separate things asked in one thread... I am crawling my websites with Nutch. When I index them, I'd like to be able to highlight my keyword and display an excerpt containing that keyword. I found a solution with highlight, but what can I do about the excerpt? Thanks and regards,

Re: Poll: What do you use for Solr performance monitoring?

2012-05-31 Thread Vadim Kisselmann
Hi Otis, done :) Till now we use Graphite, Ganglia and Zabbix. For our JVM monitoring JStatsD. Best regards Vadim 2012/5/31 Otis Gospodnetic : > Hi, > > Super quick poll:  What do you use for Solr performance monitoring? > Vote here: > http://blog.sematext.com/2012/05/30/poll-what-do-you-use-for

Re: how to read fieldValueCacheStatistics

2012-05-31 Thread elisabeth benoit
ok, thanks a lot for the answer. Elisabeth 2012/5/31 Chris Hostetter > > : When I read fieldValueCache statistics I have something that looks like > : > : item_ABC_FACET : > : > {field=ABC_FACET,memSize=4224,tindexSize=32,time=92,phase1=92,nTerms=0,bigTerms=0,termInstances=0,uses=11} > : > : >