Re: Multiple Process of the SAME solr instance

2008-09-18 Thread mohitranka
Shalin, I understand that :-) My problem is that if 1 Solr instance processes (saves) 100 documents one by one, it would not be very efficient. I want to create 10 clones (processes/threads/cores) of the same Solr instance, so that 10 documents get processed (saved to Solr) simultaneously.

Re: Solr vs Autonomy

2008-09-18 Thread Otis Gospodnetic
Geoff, Perhaps you can find out the list of features/functionalities that your project requires and we can give you quick yes/no. Or perhaps you can get those others to list those Autonomy features that they think they really need, and we can tell you how Solr compares. Otis -- Sematext --

Re: Setting request method to post on SolrQuery causes ClassCastException

2008-09-18 Thread Otis Gospodnetic
A quick work-around is, I think, to tell Solr to use the non-binary response, e.g. wt=xml (I think that's the syntax). Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message From: syoung [EMAIL PROTECTED] To: solr-user@lucene.apache.org Sent:

Re: Multiple Process of the SAME solr instance

2008-09-18 Thread Otis Gospodnetic
Mohit, I think you are thinking too hard - trying to optimize something that doesn't sound like it needs optimizing at this point in your project. I suggest you start with 1 Solr instance and then see if anything needs to be faster after you've pushed that to its limits. Otis -- Sematext --

Re: Field level security

2008-09-18 Thread Otis Gospodnetic
Hi, I don't understand all the details, but I'll inline a few comments. - Original Message From: Geoff Hopson [EMAIL PROTECTED] To: solr-user@lucene.apache.org Sent: Thursday, September 18, 2008 1:44:33 AM Subject: Field level security Hi, First post/question, so please be

Re: Special character matching 'x' ?

2008-09-18 Thread Norberto Meijome
On Thu, 18 Sep 2008 10:53:39 +0530 Sanjay Suri [EMAIL PROTECTED] wrote: One of my field values has the name Räikkönen, which contains special characters. Strangely, as I see it anyway, it matches on the search query 'x'? Can someone explain or point me to the solution/documentation?

Re: Field level security

2008-09-18 Thread Geoff Hopson
Hi Otis, Thanks for the response. I'll try and inline some clarity... 2008/9/18 Otis Gospodnetic [EMAIL PROTECTED]: I am trying to put together a security model around fields in my index. My requirement is that a user may not have permission to view certain fields in the index when he does a

Re: Solr vs Autonomy

2008-09-18 Thread Geoff Hopson
As per other thread 1) security down to field level Otherwise I am mostly happy that Solr gives me everything that Autonomy does. 2008/9/18 Otis Gospodnetic [EMAIL PROTECTED]: Geoff, Perhaps you can find out the list of features/functionalities that your project requires and we can give

AW: Date field mystery

2008-09-18 Thread Kolodziej Christian
Hi Chris, it was a long night for our Solr server because we rebuilt the complete index using well-formed date strings. The date field is now stored, so that we can see if something went wrong :-) But our problems are solved completely. Now I can give you a very exact

Re: cron job update index

2008-09-18 Thread sunnyfr
OK, thanks, it's very clear. Do you know why my cron job doesn't work? # m h dom mon dow command */5 * * * * /usr/bin/wget http://solr-test.books.com:8080/solr/books/dataimport?command=delta-import When I check the date in conf/dataimport.properties, the date and hour doesn't
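A minimal sketch of sending the same delta-import command from a script instead of wget, so that Solr's response can be captured and logged. The base URL is copied from the cron line above; the function names are illustrative, and the sketch does not by itself explain why the cron job fails.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Base URL copied from the cron line in the post above.
DATAIMPORT_URL = "http://solr-test.books.com:8080/solr/books/dataimport"


def build_command_url(base_url, command):
    """Return the DataImportHandler URL for the given command."""
    return base_url + "?" + urlencode({"command": command})


def trigger(base_url, command="delta-import", timeout=30):
    """Send the command and return Solr's raw response body as text."""
    with urlopen(build_command_url(base_url, command), timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

Calling `trigger(DATAIMPORT_URL)` from the scheduled script and appending the returned text to a log file would show whether the request reaches Solr at all.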

Re: Setting request method to post on SolrQuery causes ClassCastException

2008-09-18 Thread Noble Paul നോബിള്‍ नोब्ळ्
I guess the POST is not sending the correct 'wt' parameter. Try setting wt=javabin explicitly. wt=xml may not work because the parser is still binary. Check this: http://wiki.apache.org/solr/Solrj#xmlparser On Thu, Sep 18, 2008 at 11:49 AM, Otis Gospodnetic [EMAIL PROTECTED] wrote: A quick

Re: recip(myfield,m,a,b)

2008-09-18 Thread sunnyfr
I don't think it can work at index time, because when somebody looks for a book I want to boost the search according to the user's language ... so I don't think it can work, unless I didn't get it. Thanks for your answer, hossman wrote: : Is there a way to convert to integer to

Re: Special character matching 'x' ?

2008-09-18 Thread Sanjay Suri
Thanks Akshay and Norberto, I am still trying to make it work. I know the solution is what you pointed me to, but it is just taking me some time to make it work. thanks, -Sanjay On Thu, Sep 18, 2008 at 12:34 PM, Norberto Meijome [EMAIL PROTECTED] wrote: On Thu, 18 Sep 2008 10:53:39 +0530 Sanjay

Unable to filter fq param on a dynamic field

2008-09-18 Thread Barry Harding
Hi, I have a fairly simple solr setup with several predefined fields that are indexed and stored and also depending on the type of product I also add various dynamic fields of type string to a record, and I should mention that I am using the solr.DisMaxRequestHandler request handler called

Re: Some new SOLR features

2008-09-18 Thread Jason Rutherglen
Hi Yonik, One approach I have been working on that I will integrate into SOLR is the ability to use serialized objects for the analyzers so that the schema can be defined on the client side if need be. The analyzer classes will be dynamically loaded. Or there is no need for a schema and plain

Re: Some new SOLR features

2008-09-18 Thread Jason Rutherglen
This should be done. Great idea. On Wed, Sep 17, 2008 at 3:41 PM, Lance Norskog [EMAIL PROTECTED] wrote: My vote is for dynamically scanning a directory of configuration files. When a new one appears, or an existing file is touched, load it. When a configuration disappears, unload it. This

Re: Some new SOLR features

2008-09-18 Thread Jason Rutherglen
That would allow a single request to see a stable view of the schema, while preventing having to make every aspect of the schema thread-safe. Yes that is the best approach. Nothing will stop one from using java serialization for config persistence, Persistence should not be serialized.

Re: Some new SOLR features

2008-09-18 Thread Jason Rutherglen
Servlets is one thing. For SOLR the situation is different. There are always small changes people want to make, a new stop word, a small tweak to an analyzer. Rebooting the server for these should not be necessary. Ideally this is handled via a centralized console and deployed over the network

Re: Some new SOLR features

2008-09-18 Thread Mark Miller
Dynamic changes are not what I'm against...I'm against dynamic changes that are triggered by the app noticing that the config has changed. Jason Rutherglen wrote: Servlets is one thing. For SOLR the situation is different. There are always small changes people want to make, a new stop word,

delta-import looks stuck ???? how can I check if it's done or not ?

2008-09-18 Thread sunnyfr
<response> <lst name="responseHeader"> <int name="status">0</int> <int name="QTime">0</int> </lst> <lst name="initArgs"> <lst name="defaults"> <str name="config">data-config.xml</str> </lst> </lst>

Re: Some new SOLR features

2008-09-18 Thread Jason Rutherglen
Yes, so it's probably best to make the changes through a remote interface so that the app will be able to make the appropriate internal changes. File based system changes are less than ideal, agreed, however I suppose with an open source project such as SOLR the kitchen sink effect happens and it

Re: problem index accented character with release version of solr 1.3

2008-09-18 Thread Sean Timm
From the XML 1.0 spec.: Legal characters are tab, carriage return, line feed, and the legal graphic characters of Unicode and ISO/IEC 10646. So, \005 is not a legal XML character. It appears the old StAX implementation was more lenient than it should have been and Woodstox is doing the
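A minimal sketch of removing characters such as \005 before a document is posted, assuming the client builds its XML in Python. The character class below follows the Char production of the XML 1.0 specification; the function name is illustrative.

```python
import re

# Matches anything outside the XML 1.0 "Char" production:
# tab, line feed, carriage return, and the legal Unicode ranges.
_ILLEGAL_XML_CHARS = re.compile(
    "[^\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def strip_illegal_xml_chars(text):
    """Return text with characters that XML 1.0 forbids removed."""
    return _ILLEGAL_XML_CHARS.sub("", text)
```

Dropping the characters silently is only one option; replacing them with a space or rejecting the document may suit some data better.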

Re: delta-import looks stuck ???? how can I check if it's done or not ?

2008-09-18 Thread sunnyfr
It was taking too long so I finally restarted Tomcat ... then 5 min later my cron job started, but it looks like nothing is happening with the cron job. This is my output file, tot.txt: <?xml version="1.0" encoding="UTF-8"?> <response> <lst name="responseHeader"><int name="status">0</int><int name="QTime">0</int></lst><lst

Re: Solr vs Autonomy

2008-09-18 Thread Walter Underwood
It depends entirely on the needs of the project. For some things, Solr is superior to Autonomy, for other things, not. I used to work at Autonomy (and Verity and Inktomi and Infoseek), and I chose Solr for Netflix. It is working great for us. wunder == Walter Underwood Former Ultraseek Architect

Re: Solr vs Autonomy

2008-09-18 Thread Geoff Hopson
My project is looking to index 10s of millions of documents, providing search across a live-live environment (hence index distribution/replication is important). Most searches have to be done (ie to end user) in 5 seconds or less. The index has about 30 fields, and I reckon that the security

Re: No server response code on insert: how do I avoid this at high speed?

2008-09-18 Thread Paleo Tek
Otis Gospodnetic wrote: Perhaps the container logs explain what happened? How about just throttling to the point where the failure rate is 0%? Too slow? Otis's questions regarding dropped inserts sent me back to the drawing board. The system had been tuned to a slower database to

Re: delta-import looks stuck ???? how can I check if it's done or not ?

2008-09-18 Thread Shalin Shekhar Mangar
Hit /dataimport again from a browser and refresh periodically to see the progress (number of documents indexed). On Thu, Sep 18, 2008 at 7:55 PM, sunnyfr [EMAIL PROTECTED] wrote: It was too long so I finally restart tomcat .. then 5mn later my cron job started : but it looks like nothing
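A minimal sketch of reading the same status programmatically instead of refreshing a browser. It assumes the /dataimport response carries a top-level `status` string and a `statusMessages` list, as DataImportHandler responses typically do; the sample document below is illustrative, not output captured from the poster's server.

```python
import xml.etree.ElementTree as ET

# Illustrative response body; real output contains more entries.
SAMPLE = """<response>
  <lst name="responseHeader"><int name="status">0</int><int name="QTime">0</int></lst>
  <str name="status">busy</str>
  <lst name="statusMessages">
    <str name="Total Documents Processed">1200</str>
  </lst>
</response>"""


def parse_status(xml_text):
    """Return (state, messages) from a /dataimport status response."""
    root = ET.fromstring(xml_text)
    # Only direct children of <response> are searched, so the
    # <int name="status"> inside responseHeader is not matched.
    state = root.findtext("str[@name='status']")
    messages = {
        el.get("name"): (el.text or "")
        for el in root.findall("lst[@name='statusMessages']/str")
    }
    return state, messages
```

Polling this every few seconds and comparing the counters between calls shows whether the import is advancing or stalled.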

Re: delta-import looks stuck ???? how can I check if it's done or not ?

2008-09-18 Thread sunnyfr
It is exactly what I've done but it doesn't work like that ... - what would that mean ... the cron job can't hit it properly? - I browsed to /dataimport but it was like nothing was running, so I finally went back to /dataimport?command=delta-import and then to /dataimport and I refresh it

Re: Solr vs Autonomy

2008-09-18 Thread Ryan McKinley
On Sep 18, 2008, at 3:23 AM, Geoff Hopson wrote: As per other thread 1) security down to field level How complex a security model do you need? Is each user's field visibility totally distinct? Are there a few basic groups? If you are willing to write (or hire someone to write) a

Re: delta-import looks stuck ???? how can I check if it's done or not ?

2008-09-18 Thread Shalin Shekhar Mangar
Well it shows the number of documents that have changed, you can't expect 1603970 documents to be indexed instantly. On Thu, Sep 18, 2008 at 8:24 PM, sunnyfr [EMAIL PROTECTED] wrote: It is exactly what I've done but it can't works like that ... - what would that mean ... cron job can't

Re: delta-import looks stuck ???? how can I check if it's done or not ?

2008-09-18 Thread sunnyfr
I agree about that, but last time the number was no different 4 hours later, and if I check now, nothing has changed. Does it have to go across all the data like a full import? I thought it would bring back just the ids which need to be modified ...? <lst name="statusMessages"> <str name="Time

Re: delta-import looks stuck ???? how can I check if it's done or not ?

2008-09-18 Thread Shalin Shekhar Mangar
On Thu, Sep 18, 2008 at 8:45 PM, sunnyfr [EMAIL PROTECTED] wrote: I agree about that but the last time 4hours later the number wasn't different : Do you mean that the number doesn't change at all on refreshing the page? Can you check the solr log file for exceptions? I suspect that you may

Re: Solr vs Autonomy

2008-09-18 Thread Walter Underwood
I would do the field visibility one layer up from the search engine. That layer already knows about the user and can request the appropriate fields. Or request them all (better HTTP caching) and only show the appropriate ones. As I understand your application, putting access control in Solr
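A minimal sketch of the "request them all and only show the appropriate ones" variant, done in the application layer. The document is modelled as a plain dict and the field names are hypothetical.

```python
def visible_view(doc, allowed_fields):
    """Return a copy of doc containing only the fields this user may see."""
    return {name: value for name, value in doc.items() if name in allowed_fields}


# Hypothetical search hit and permission set.
hit = {"id": "42", "title": "Report", "salary": "100000"}
public_fields = {"id", "title"}
```

Here `visible_view(hit, public_fields)` keeps `id` and `title` and drops `salary`, while the Solr request itself stays identical for every user.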

Re: delta-import looks stuck ???? how can I check if it's done or not ?

2008-09-18 Thread sunnyfr
this is my log file : [EMAIL PROTECTED]:/home# tail -f /var/log/tomcat5.5/catalina.$(date +%Y-%m-%d).log Sep 18, 2008 5:25:02 PM org.apache.solr.handler.dataimport.JdbcDataSource$1 call INFO: Creating a connection for entity books with URL: jdbc:mysql://master-spare.vip.books.com/books Sep 18,

Re: Hardware config for SOLR

2008-09-18 Thread Matthew Runo
I can't speak to a lot of this - but regarding the servers I'd go with the more powerful ones, if only for the amount of RAM. Your index will likely be larger than 1 gig, and with only two you'll have a lot of your index not stored in RAM, which will slow down your QPS. Thanks for your

RE: Solr vs Autonomy

2008-09-18 Thread Kashyap, Raghu
Hi Geoff, I cannot vouch for Autonomy; however, earlier this year we did evaluate Endeca & Solr and we went with Solr. Some of the reasons were: 1. Freedom of open source with Solr 2. Very good active Solr open source community 3. Features pretty much overlap between Solr & Endeca 4. Endeca

Re: Unable to filter fq param on a dynamic field

2008-09-18 Thread Otis Gospodnetic
Barry, does this return the correct hits: http://127.0.0.1:8080/apache-solr-1.3.0/IvolutionSearch?q=Output-Type-facet:Monochrome Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message From: Barry Harding [EMAIL PROTECTED] To:

Re: AW: Date field mystery

2008-09-18 Thread Otis Gospodnetic
Hi Christian, While I can't tell you whether the problem with - will be solved when you try it on 1.3, I can tell you that you should probably trim your dates so they are not as fine as you currently have them, unless you need such precision. We need to add this to the FAQ. :) Otis --
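A minimal sketch of trimming a timestamp to day precision before indexing, assuming the dates are prepared in Python and that day granularity is enough for the application. The output uses Solr's `YYYY-MM-DDThh:mm:ssZ` date syntax; the function name is illustrative.

```python
from datetime import datetime, timezone


def solr_day(dt):
    """Format a timezone-aware datetime as a Solr date at midnight UTC."""
    utc = dt.astimezone(timezone.utc)
    return utc.strftime("%Y-%m-%dT00:00:00Z")
```

For example, 2008-09-18 13:44:33 UTC becomes `2008-09-18T00:00:00Z`, so every document from that day shares one indexed value.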

Re: Field level security

2008-09-18 Thread Otis Gospodnetic
Hi, If all you have to do is hide certain fields from search results for some users, then your application -- the application that sends search requests to Solr can just use different fl=XXX parameters based on user's permission. I think that's all you need and the custom fieldType should
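A minimal sketch of choosing the fl parameter from the user's permission in the application that talks to Solr. The roles and field names are hypothetical; only the `q` and `fl` parameters themselves come from Solr.

```python
from urllib.parse import urlencode

# Hypothetical mapping from application role to the fields it may see.
VISIBLE_FIELDS = {
    "public": ["id", "title"],
    "staff": ["id", "title", "internal_notes"],
}


def select_params(query, role):
    """Build the query string for a search, limiting fl to permitted fields."""
    # Unknown roles fall back to the most restrictive field list.
    fields = VISIBLE_FIELDS.get(role, VISIBLE_FIELDS["public"])
    return urlencode({"q": query, "fl": ",".join(fields)})
```

This only hides fields from results; it does not stop a user from searching on a hidden field, which would need separate handling.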

Re: Solr vs Autonomy

2008-09-18 Thread Otis Gospodnetic
Geoff, In short: all items that you listed are not a problem for Solr. Indices can be sharded, distributed search is possible, custom ranking is possible, 30 fields is possible, etc. etc. Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message From:

snapshot.yyyymmdd ... can't find them?

2008-09-18 Thread sunnyfr
Hi, sorry, I think I've started rsyncd properly: [EMAIL PROTECTED]:/# ./data/solr/books/bin/rsyncd-enable [EMAIL PROTECTED]:/# ./data/books/video/bin/rsyncd-start but then I can't find the snapshot.current files ?? How can I check that I did it properly? My rsyncd.log: 2008/09/18

Re: Setting request method to post on SolrQuery causes ClassCastException

2008-09-18 Thread syoung
I tried setting the 'wt' parameter to both 'xml' and 'javabin'. Neither worked. However, setting the parser on the server to XMLResponseParser did fix the problem. Thanks for the help. Susan Noble Paul നോബിള്‍ नोब्ळ् wrote: I guess the post is not sending the correct 'wt' parameter. try

RE: Unable to filter fq param on a dynamic field

2008-09-18 Thread Barry Harding
Hi Otis, no, that does not seem to bring back the correct results either; in fact it's still zero results. It's also not bringing back results if I use the standard handler http://127.0.0.1:8080/apache-solr-1.3.0/select?q=Output-Type-facet:Monochrome but the field is visible in the documents

Re: Dismax + Dynamic fields

2008-09-18 Thread Jon Drukman
Daniel Papasian wrote: Norberto Meijome wrote: Thanks Yonik. ok, that matches what I've seen - if i know the actual name of the field I'm after, I can use it in a query, but i can't use the dynamic_field_name_* (with wildcard) in the config. Is adding support for this something that is

Re: Unable to filter fq param on a dynamic field

2008-09-18 Thread Otis Gospodnetic
Barry, You are seeing the value of the field as it was saved (as the original), but perhaps something is funky with how it was analyzed/tokenized at index time and how it is being analyzed now at query time. Double-check your fieldType/analysis settings for this field and make sure you are

RE: Searching for future or null dates

2008-09-18 Thread Chris Maxwell
Here is what I was able to get working with your help. (productId:(102685804)) AND liveDate:[* TO NOW] AND ((endDate:[NOW TO *]) OR ((*:* -endDate:[* TO *]))) The *:* is what I was missing. Thanks for your help. hossman wrote: : If the query starts with a negative clause Lucene returns
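A minimal sketch that assembles the same filter for any product id, with the two endDate cases named separately. The field names come from the query above; the function name is illustrative and the redundant parentheses of the original are dropped.

```python
def live_product_query(product_id):
    """Match a product that is live now and has no end date or a future one."""
    # A purely negative clause matches nothing on its own, so it is
    # subtracted from the match-all query *:*.
    no_end_date = "(*:* -endDate:[* TO *])"
    not_yet_ended = "endDate:[NOW TO *]"
    return (
        f"productId:({product_id}) AND liveDate:[* TO NOW] "
        f"AND ({not_yet_ended} OR {no_end_date})"
    )
```

The `*:*` inside `no_end_date` is the piece the post identifies as missing.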

Re: Filtering results

2008-09-18 Thread ristretto . rb
Otis, Would it be reasonable to run a query like this http://localhost:8280/solr/select/?q=terms_x&version=2.2&start=0&rows=0&indent=on 10 times, once for each result from an initial category query on a different index? So, it's still 1+10, but I'm not returning values. This would give me the number
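A minimal sketch of generating those count-only requests, one per term. The base URL and parameters are taken from the example above; the term list and function name are hypothetical.

```python
from urllib.parse import urlencode

# Base URL copied from the example query in the post above.
SELECT_URL = "http://localhost:8280/solr/select/"


def count_only_url(term):
    """Build a select URL that returns the hit count but no documents."""
    params = {"q": term, "version": "2.2", "start": 0, "rows": 0, "indent": "on"}
    return SELECT_URL + "?" + urlencode(params)


def count_only_urls(terms):
    """Return one count-only URL per term, in the order given."""
    return [count_only_url(term) for term in terms]
```

With `rows=0` the response still reports the number of matches, so each of the follow-up requests stays small.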

RE: Hardware config for SOLR

2008-09-18 Thread Andrey Shulinskiy
Matthew, Thanks, a very good point. Andrey. -Original Message- From: Matthew Runo [mailto:[EMAIL PROTECTED] Sent: Thursday, September 18, 2008 11:38 AM To: solr-user@lucene.apache.org Subject: Re: Hardware config for SOLR I can't speak to a lot of this - but regarding the

firstSearcher and newSearcher events

2008-09-18 Thread oleg_gnatovskiy
Hello. I am using the spellcheck component (https://issues.apache.org/jira/browse/SOLR-572). Since the spell checker index is kept in RAM, it gets erased every time the Solr server gets restarted. I was thinking of using either the firstSearcher or the newSearcher to reload the index every time

Re: firstSearcher and newSearcher events

2008-09-18 Thread Shalin Shekhar Mangar
On Fri, Sep 19, 2008 at 5:55 AM, oleg_gnatovskiy [EMAIL PROTECTED] wrote: Hello. I am using the spellcheck component (https://issues.apache.org/jira/browse/SOLR-572). Since the spell checker index is kept in RAM, it gets erased every time the Solr server gets restarted. I was thinking of

Re: Filtering results

2008-09-18 Thread Otis Gospodnetic
Gene, I haven't looked at Field Collapsing for a while, but if you have a single index and collapse hits on your category field, then won't the first 10 hits be the items you are looking for - the top 1 item for each category x 10, using a single query? Otis -- Sematext -- http://sematext.com/ -- Lucene -

error when post xml data to solr

2008-09-18 Thread 李学健
hi, all when i post an xml file to solr, some errors happen as below: == com.ctc.wstx.exc.WstxEOFException: Unexpected EOF in prolog at [row,col {unknown-source}]: [1,0] at com.ctc.wstx.sr.StreamScanner.throwUnexpectedEOF(StreamScanner.java:686) at

Re: Filtering results

2008-09-18 Thread ristretto . rb
Thanks, Otis, for the reply! Always appreciated! That is indeed what we are looking to implement. But I'm running out of time to prototype or experiment for this release. I'm going to run the two index thing for now, unless I find something saying it is really easy and sensible to run one and

Can I add custom fields to the input XML file?

2008-09-18 Thread convoyer
Hi guys. Is the XML format for inputting data a standard one, or can I change it? That is, instead of: <add><doc> <field name="id">3007WFP</field> <field name="name">Dell Widescreen UltraSharp 3007WFP</field> <field name="manu">Dell, Inc.</field> </doc></add> can I enter something like, <custList><clients> <field
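A minimal sketch of producing the add/doc/field format quoted above from plain records, assuming the source data is available as Python dicts. It shows one way to map custom data onto Solr's update format on the client side; the function name is illustrative.

```python
import xml.etree.ElementTree as ET


def to_add_xml(docs):
    """Serialize a list of field dicts as a Solr <add> update message."""
    add = ET.Element("add")
    for doc in docs:
        doc_el = ET.SubElement(add, "doc")
        for name, value in doc.items():
            field = ET.SubElement(doc_el, "field", name=name)
            field.text = str(value)
    return ET.tostring(add, encoding="unicode")
```

Escaping of characters such as `&` and `<` in field values is handled by ElementTree during serialization.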

Re: Setting request method to post on SolrQuery causes ClassCastException

2008-09-18 Thread Noble Paul നോബിള്‍ नोब्ळ्
It is surprising that this happens. The javabin format offers significant perf improvements over the XML one. You can probably also try this: <requestHandler name="/search" class="org.apache.solr.handler.component.SearchHandler"> <lst name="defaults"> <str name="wt">javabin</str> </lst> </requestHandler>