Re: How real-time are Solr/Lucene queries?

2010-05-26 Thread Amit Nithian
This is an interesting discussion and I have a few questions: 1) My apologies but I haven't been following the NRT patch beyond what was presented at a meetup some months back and the wiki but what is the status of it in Solr? 2) What are typical/accepted definitions of Real Time vs Near Real

Re: question about indexing...

2010-05-26 Thread Jörg Agatz
OK, done... but no changes! I made the following in schema.xml: <field name="all" type="string" indexed="true" stored="true" multiValued="true"/> <field name="P_CONTENT_ITEMS_COMMENT" type="text" indexed="true" stored="true" multiValued="true"/> <field name="comment" type="string" indexed="true" stored="true"

Re: question about indexing...

2010-05-26 Thread Jörg Agatz
Sorry, I mean: the XML looks like this: <field name="P_CONTENT_ITEMS_COMMENT"><![CDATA[ Hallo leute. mein name ist dein name und wir wollen eigentlich nur unsere Ruhe haben. <b>ich du er sie es</b> Ha ha Ha ha ha ha ha ha ha ha ]]></field>

Re: question about indexing...

2010-05-26 Thread Jörg Agatz
OK, done. I rebooted the server and now it works. Is the text field single-instance? How can I make it so? The text field has indexed the word "Hallo". If I search for Hallo, it is found; hallo, found; Hall*, not found; hall*, found. But some users will search for Hall*. One more little question I have... The Difference from

Re: sort by field length

2010-05-26 Thread Sascha Szott
Hi Erick, Erick Erickson wrote: Ah, I may have misunderstood, I somehow got it in my mind you were talking about the length of each term (as in string length). But if you're looking at the field length as the count of terms, that's another question, sorry for the confusion... I have to ask,

Re: Solr Cell and encrypted pdf files

2010-05-26 Thread Yiannis Pericleous
I've opened an issue and submitted a patch: https://issues.apache.org/jira/browse/SOLR-1929 Chris Hostetter wrote: : I can't seem to get solr cell to index password protected pdf files. : I can't figure out how to pass the password to tika and looking at : ExtractingDocumentLoader, : it doesn't

Re: question about indexing...

2010-05-26 Thread Erik Hatcher
On May 26, 2010, at 3:49 AM, Jörg Agatz wrote: is the Textfield Single instance? how can i make it? I'm not sure what you're asking. You can have as many text fields as you like, or as many of any other type as well. In textfield indext the Word : Hallo if i search Hallo i found hallo

Re: Solr read-only core

2010-05-26 Thread Mark Miller
On 5/25/10 10:08 PM, Yao wrote: My motivation is more from the performance perspective than the functional perspective. I was hoping that by opening the Solr index/core read-only, the underlying Lucene IndexReader can be opened in read-only mode for optimum query performance (removing the overhead of

fl and nulls

2010-05-26 Thread dan sutton
Hi, In Solr 1.3 it looks like null fields were returned if requested with the fl param, whereas with Solr 1.4, nulls are omitted entirely. Is there a way to have the nulls returned with Solr 1.4, e.g. ... <doc> <field1/> <field2/> </doc> Cheers, Dan

Re: fl and nulls

2010-05-26 Thread Yonik Seeley
On Wed, May 26, 2010 at 6:12 AM, dan sutton danbsut...@gmail.com wrote: In Solr 1.3 it looks like null fields were returned if requested with the fl param, whereas with Solr 1.4, nulls are omitted entirely. Can you elaborate on what you mean by null? Is this a string field with a zero length

Re: sort by field length

2010-05-26 Thread Erick Erickson
Take a look at the scoring algorithm on the Wiki, it already takes this into account, albeit modified by how many times the term is mentioned in the field. So a field with 5 terms and one match will score higher than one with 10 terms and one match. Where it lands with 10 terms and 2 matches I
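The length effect described above can be sketched numerically. This is a simplified illustration only: it assumes the classic default similarity factors tf = sqrt(freq) and lengthNorm = 1/sqrt(numTerms), and it ignores idf, boosts, and the lossy single-byte encoding of stored norms. The function name is hypothetical, not part of any Solr or Lucene API.

```python
import math


def field_score_factor(term_freq: int, field_length: int) -> float:
    """Simplified tf * lengthNorm for one term in one field.

    Assumption: tf = sqrt(freq) and lengthNorm = 1 / sqrt(number of terms),
    with idf, boosts, and norm quantization left out.
    """
    tf = math.sqrt(term_freq)
    length_norm = 1.0 / math.sqrt(field_length)
    return tf * length_norm


if __name__ == "__main__":
    # (matches, terms in field) for the cases mentioned in the message
    for freq, length in [(1, 5), (1, 10), (2, 10)]:
        factor = field_score_factor(freq, length)
        print(f"{length} terms, {freq} match(es): {factor:.3f}")
```

Under these assumptions the shorter field wins when the match count is equal, and a second match in the longer field raises its factor again; adding debugQuery=true to a real request shows the actual numbers.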

Any realtime indexing plugin available for SOLR

2010-05-26 Thread bbarani
Hi, Sorry if I am asking this question again in this forum. Is there any plugin which I can use to do real-time indexing? I have a requirement where we have an application which sits on top of a SQL Server DB and updates happen on a day-to-day basis. Users would like to see the changes made to

Re: How real-time are Solr/Lucene queries?

2010-05-26 Thread Walter Underwood
On May 25, 2010, at 11:24 PM, Amit Nithian wrote: 2) What are typical/accepted definitions of Real Time vs Near Real Time? Real time means that an update is available in the next query after it commits. Near real time means that the delay is small, but not zero. This is within a single

Re: Any realtime indexing plugin available for SOLR

2010-05-26 Thread Marco Martinez
Maybe this will help you http://snaprojects.jira.com/wiki/display/ZOIE/Zoie+Solr+Plugin Marco Martínez Bautista http://www.paradigmatecnologico.com Avenida de Europa, 26. Ática 5. 3ª Planta 28224 Pozuelo de Alarcón Tel.: 91 352 59 42 2010/5/26 bbarani bbar...@gmail.com Hi, Sorry if I am

RE: Any realtime indexing plugin available for SOLR

2010-05-26 Thread Nagelberg, Kallin
I'm afraid nothing is completely 'real-time'. Even when doing your inserts on the database there is time taken for those operations to complete. Right now I have my Solr server autocommitting every 30 seconds, which is 'real-time' enough for me. You need to figure out what your threshold is, and

nested querries, and LocalParams syntax

2010-05-26 Thread Jonathan Rochkind
So I'm trying to wrap my head around nested queries. Also that thing that isn't a nested query, but is similar, which I think is called LocalParams syntax, like: q={!dismax qf=$something}cat dog (All my examples are not URL-encoded for clarity; of course they'd have to be before sending to
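As a minimal sketch of the URL-encoding step mentioned above, the LocalParams query can be passed through Python's standard urllib before it is sent. The value given to the `something` parameter is a made-up placeholder, and the helper name is hypothetical.

```python
from urllib.parse import parse_qs, urlencode


def encode_params(params: dict) -> str:
    """Percent-encode request parameters into a query string."""
    return urlencode(params)


# The query from the message, plus the parameter that $something refers to.
# "title^2 body" is an illustrative assumption, not from the original thread.
params = {
    "q": "{!dismax qf=$something}cat dog",
    "something": "title^2 body",
}

query_string = encode_params(params)

if __name__ == "__main__":
    print(query_string)
    # Decoding gives back the readable form used in the examples.
    print(parse_qs(query_string)["q"][0])
```

The braces, `!`, `$`, `=` and spaces inside the q value are all escaped, so the server sees the LocalParams block as part of a single parameter value.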

Re: Any realtime indexing plugin available for SOLR

2010-05-26 Thread Dennis Gearon
I thought that if entries were COMMITed to the index, they were immediately visible? Is this true, or am I smoking Java coffee beans? Dennis Gearon

Re: How real-time are Solr/Lucene queries?

2010-05-26 Thread Thomas J. Buhr
What about my situation? My renderers need to query the index for fast access to layout and style info as I already described about 3 messages ago on this thread. Another scenario is having automatic queries triggered as my midi player iterates through the model. As the player encounters

Re: nested querries, and LocalParams syntax

2010-05-26 Thread Yonik Seeley
Have you seen http://wiki.apache.org/solr/LocalParams ? It may answer some of the questions, such as stating that backslash escaping works within quoted strings. I'd encourage you to try things out with the example server and adding debugQuery=true to your requests... it's the easiest way to

Re: SOLR-343 date facet mincount patch

2010-05-26 Thread Umesh_
Chris, The date facet mincount works now. Sorry, my bad, after applying the patch, I did not compile the source. After compiling the source it works. Thanks, Umesh Kant

XSLT for JSON

2010-05-26 Thread stockii
Hello. I have a little/big problem. I want to change the response format of the TermsComponent. Is it possible to transform the XML into my JSON format with XSLT? Or with XSLT from JSON to JSON... ;-) The new JSON format should be exactly the same as the standard response... Thanks

RE: How real-time are Solr/Lucene queries?

2010-05-26 Thread Nagelberg, Kallin
Searching is very fast with Solr, but nowhere near as fast as keying into a map. There is possibly disk I/O if your document isn't cached. Your situation sounds unique enough that I think you're going to need to prototype to see if it meets your demands. Figure out how 'fast' is 'fast' for your

Re: XSLT for JSON

2010-05-26 Thread Erik Hatcher
Could you elaborate on your use case? Why do you need a different format? XSLT certainly could produce JSON, but that seems a mighty ugly route to go. The VelocityResponseWriter could also write out JSON, but maybe what you really want is some basic output that VrW could generate? Or

ApacheCon CFP Closes on Friday

2010-05-26 Thread Grant Ingersoll
If you are planning on submitting for ApacheCon, you have until Friday to do so. See the CFP at http://blogs.apache.org/conferences/date/20100428

Re: XSLT for JSON

2010-05-26 Thread Jon Baer
You should already get this out of the box... just tack on a wt=json to the params, i.e. http://localhost:8983/solr/select/?q=*%3A*&version=2.2&start=0&rows=10&indent=on&qt=tvrh&tv=true&tv.tf=true&tv.df=true&tv.positions&tv.offsets=true&wt=json If you look @ /apache-solr-1.4.0/contrib/velocity/src/main

RE: seemingly impossible query

2010-05-26 Thread Nagelberg, Kallin
I developed a solution to this problem and I thought I should share it in case someone encounters a similar problem. Recap: My problem was that for every document in my index I needed to know if it was the most recent that contained an ID in a multi-valued field. Doing this for one ID was

Re: ApacheCon CFP Closes on Friday

2010-05-26 Thread Jason Rutherglen
Grant, the link's broken? http://blogs.apache.org/conferences/date/20100428 Unexpected Exception Status Code 500 Message You have closed the EntityManager, though the persistence context will remain active until the current transaction commits. Type Exception Roller has

solr configuration for Subversion

2010-05-26 Thread Stefan Maric
I've seen the info about SvnQuery and wondered if anyone has a Solr configuration / loader module. Regards, Stefan Maric

minpercentage vs. mincount

2010-05-26 Thread Lukas Kahwe Smith
Hi, Obviously I could implement this in userland (like mincount for that matter), but I wonder if anyone else sees use in being able to define that a facet must match a minimum percentage of all documents in the result set, rather than a hardcoded value? The idea being that while I might

Highlighting questions

2010-05-26 Thread Blargy
What are the correct settings to get highlighting excerpting working? Original Text: The quick brown fox jumps over the lazy dog Query: jump Result: fox jumps over Can you do something like the above with the highlighter or can it only surround matches with pre and post tags? Can

Re: solr configuration for Subversion

2010-05-26 Thread Chris Hostetter
: I've seen the info about SvnQuery wondered if anyone has a Solr : configuration / loader module I've never heard of SvnQuery until your email, but it seems to be built using Lucene.Net... http://svnquery.tigris.org/ If you're looking for tools for indexing subversion repos with

Re: Solr Architecture discussion

2010-05-26 Thread Chris Hostetter
: 4- trigger swap between core 1 and core2 : 5- At this point Slave index has been renewed ... we can revert back to the : previous index if there was any issues with the new one. these steps are largely unnecessary -- within a single SolrCore Solr already keeps track of the current searcher

Re: Need guidance on schema type

2010-05-26 Thread Lance Norskog
If you use the stripping filter, the stored text is the original HTML. You can then highlight text inside the HTML. If you use the stripping DIH transformer, you will store the stripped text. It will be somewhat smaller. You can highlight the stripped text blobs, but you can't highlight the

Re: searching documents in solr

2010-05-26 Thread Lance Norskog
solr/admin/analysis.jsp allows you to explore what the analyser stacks do. This is the best way to learn. On Wed, May 26, 2010 at 6:13 PM, Chris Hostetter hossman_luc...@fucit.org wrote: To add to the excellent advice so far: when asking a question, please be explicit and show actual URLs used

Re: Dynamic analyzers

2010-05-26 Thread Lance Norskog
If you want to OR a search across many language inputs, you can copy all of the text into an all-languages field. A pan-language search would just hit that field. On Mon, May 24, 2010 at 9:28 AM, dan sutton danbsut...@gmail.com wrote: Hi, I have a requirement to dynamically choose a fieldType

Re: Machine utilization while indexing

2010-05-26 Thread Chris Hostetter
: So now I wonder why BinaryRequestWriter (and BinaryUpdateRequestHandler) : aren't turned on by default. (esp. considering some threads on the dev-list I don't really understand this question -- the BinaryUpdateRequestHandler is registered with the path /update/javabin in the example

Re: Dynamic analyzers

2010-05-26 Thread Jan Høydahl / Cominvent
You'll have a hard time supporting stemming etc. with this approach. Perhaps a hybrid solution, querying across the all-languages field and a few selected language-specific fields which receive proper linguistic treatment? qf=text_all text_en^2.0 text_de^1.5 Jan Høydahl On 27 May 2010, at
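One way the hybrid qf value above could be assembled on the client side is sketched below. The field names are the ones from the message; the helper function is a hypothetical convenience, and how the resulting string is sent to Solr is left out.

```python
from typing import Dict, Optional


def format_qf(boosts: Dict[str, Optional[float]]) -> str:
    """Join field names into a qf string, appending ^boost where one is given.

    A boost of None means the field is listed without an explicit boost.
    """
    parts = []
    for field, boost in boosts.items():
        parts.append(field if boost is None else f"{field}^{boost}")
    return " ".join(parts)


# Catch-all field without a boost, language-specific fields weighted higher.
qf = format_qf({"text_all": None, "text_en": 2.0, "text_de": 1.5})

if __name__ == "__main__":
    print(qf)
```

Keeping the boosts in a mapping makes it easy to add or reweight a language-specific field without editing the query string by hand.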