Re: lucene-java version mismatches

2009-03-25 Thread Paul Libbrecht
Could I suggest that the Maven repositories are populated the next time a release of solr-specific-lucenes is made? But they are? It is inside the org.apache.solr group since those Lucene jars are released by Solr -- http://repo2.maven.org/maven2/org/apache/solr/ Nope,

Status of an update request

2009-03-25 Thread Pierre-Yves LANDRON
Hello, When I send an update or a commit to Solr via curl, the response I get is formatted in HTML; I can't find a way to get a machine-readable response. Here is what is said on the subject in the Solr config file: The response format differs from solr1.1 formatting and returns a standard

Re: lucene-java version mismatches

2009-03-25 Thread Shalin Shekhar Mangar
On Wed, Mar 25, 2009 at 12:30 PM, Paul Libbrecht p...@activemath.org wrote: Could I suggest that the Maven repositories are populated the next time a release of solr-specific-lucenes is made? But they are? It is inside the org.apache.solr group since those Lucene jars are released by Solr --

Anyone use solr admin and Opera?

2009-03-25 Thread ristretto.rb
Hello, I'm a happy Solr user. Thanks for the excellent software!! Hopefully this is a good question, I have indeed looked around the FAQ and google and such first. I have just switched from Firefox to Opera for web browsing. (Another story) When I use the solr/admin the home page and stats

numeric range facets

2009-03-25 Thread Ashish P
Similar to getting range facets for dates, where we specify start, end and gap, can we do the same thing for numeric facets, where we specify start, end and gap? -- View this message in context: http://www.nabble.com/numeric-range-facets-tp22698330p22698330.html Sent from the Solr - User mailing

Re: get all facets

2009-03-25 Thread Shalin Shekhar Mangar
On Wed, Mar 25, 2009 at 7:30 AM, Ashish P ashish.ping...@gmail.com wrote: Can I get all the facets in QueryResponse?? You can get all the facets that are returned by the server. Set facet.limit to the number of facets you want to retrieve. See

Re: numeric range facets

2009-03-25 Thread Shalin Shekhar Mangar
On Wed, Mar 25, 2009 at 3:26 PM, Ashish P ashish.ping...@gmail.com wrote: Similar to getting range facets for dates, where we specify start, end and gap, can we do the same thing for numeric facets, where we specify start, end and gap? No. But you can do this with multiple queries by using
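
The reply above is cut off in this archive, so the parameter it goes on to name is not shown. One way to get numeric buckets "with multiple queries" is to send one `facet.query` per range. The following is only a sketch under that assumption: the field name `price` and the helper `numeric_range_facet_params` are made up for illustration and are not part of Solr.

```python
from urllib.parse import urlencode


def numeric_range_facet_params(field, start, end, gap):
    """Return request parameters with one facet.query per numeric bucket.

    Ranges use the inclusive [lo TO hi] syntax, so a value that sits exactly
    on a boundary is counted in both neighbouring buckets.
    """
    if gap <= 0:
        raise ValueError("gap must be positive")
    # q=*:* and rows=0 ask only for counts over all documents (an assumption;
    # substitute the real query as needed).
    params = [("q", "*:*"), ("rows", "0"), ("facet", "true")]
    lo = start
    while lo < end:
        hi = min(lo + gap, end)
        params.append(("facet.query", f"{field}:[{lo} TO {hi}]"))
        lo = hi
    return params


if __name__ == "__main__":
    # Hypothetical field "price", buckets of 100 between 0 and 300.
    print(urlencode(numeric_range_facet_params("price", 0, 300, 100)))
```

The counts for each bucket would then be read from the facet query section of the response, keyed by the query string that was sent.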

Re: Anyone use solr admin and Opera?

2009-03-25 Thread Shalin Shekhar Mangar
On Wed, Mar 25, 2009 at 1:33 PM, ristretto.rb ristretto...@gmail.com wrote: Hello, I'm a happy Solr user. Thanks for the excellent software!! Hopefully this is a good question, I have indeed looked around the FAQ and google and such first. I have just switched from Firefox to Opera for web

Re: Status of an update request

2009-03-25 Thread Shalin Shekhar Mangar
On Wed, Mar 25, 2009 at 12:42 PM, Pierre-Yves LANDRON pland...@hotmail.com wrote: Hello, When I send an update or a commit to Solr via curl, the response I get is formatted in HTML; I can't find a way to get a machine-readable response. Here is what is said on the subject in the Solr

Deleting documents

2009-03-25 Thread Rui Pereira
I'm trying to delete documents based on the following type of update request: <delete><query>topologyid:3140</query><query>topologyid:3142</query></delete> This doesn't cause any changes to the index, and if I try to read the response, the following error occurs: 13:32:35,196 ERROR [STDERR] 25/Mar/2009 13:32:35
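
A delete-by-query message of the kind described above can be built programmatically rather than by string concatenation, which avoids malformed XML. This is a minimal sketch: the helper name `delete_by_query_xml` is hypothetical, and whether a particular Solr version accepts several `<query>` elements inside one `<delete>` is not confirmed by this thread.

```python
import xml.etree.ElementTree as ET


def delete_by_query_xml(queries):
    """Build a Solr <delete> update message with one <query> child per entry."""
    root = ET.Element("delete")
    for query in queries:
        # ElementTree escapes &, < and > in the text for us.
        ET.SubElement(root, "query").text = query
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(delete_by_query_xml(["topologyid:3140", "topologyid:3142"]))
```

The resulting string would be POSTed to the update handler, and deletions only become visible to searches after a commit is sent.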

Copy solr indexes from 2 solr instance

2009-03-25 Thread prerna07
Hi, Issue 1: I have 2 Solr instances; I need to copy indexes from the solr1 instance to solr2 without restarting Solr. Please suggest how this will work. Both Solr instances are on a multicore setup. Issue 2: I deleted all indexes from Solr and reloaded my core; Solr admin returns 0 results. The size of

speeding up indexing with a LOT of indexed fields

2009-03-25 Thread Britske
Hi, I'm having difficulty indexing a collection of documents in a reasonable time. It's now going at 20 docs/sec on a c1.xlarge instance of Amazon EC2, which just isn't enough. This box has 8GB RAM and the equivalent of 20 Xeon processors. These documents have a couple of stored, indexed,

Re: speeding up indexing with a LOT of indexed fields

2009-03-25 Thread Otis Gospodnetic
Britske, Here are a few quick ones: - Does that machine really have 10 CPU cores? If it has significantly less, you may be beyond the indexing sweet spot in terms of indexer threads vs. CPU cores - Your maxBufferedDocs is super small. Comment that out anyway. Use ramBufferSizeMB and

Re: Copy solr indexes from 2 solr instance

2009-03-25 Thread Otis Gospodnetic
Prerna, You could create an index snapshot with snapshooter script and then copy the index. You should do that while the source index is not getting modified. Re issue #2: run optimize Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message From:

Re: Snapinstaller + Overlapping onDeckSearchers Problems

2009-03-25 Thread Otis Gospodnetic
Hm, I can't quite tell from here, but that is just a warning, so it's not super problematic at this point. Could it be that one of your other caches (query cache) is large and lots of items are copied on searcher flip? Could it be that your JVM doesn't have a large enough heap, or enough free heap?

Re: Not able to configure multicore

2009-03-25 Thread Otis Gospodnetic
Hm, where does that /solr2 come from? Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message From: mitulpatel mitulpa...@greymatterindia.com To: solr-user@lucene.apache.org Sent: Wednesday, March 25, 2009 12:30:11 AM Subject: Re: Not able to

Re: Hardware Questions...

2009-03-25 Thread Otis Gospodnetic
Ah, it's hard to tell. I look at index size on disk, number of docs, query rate, types of queries, etc. Are you actually seeing problems with your existing servers? Or see specific performance movement in one of the aspects? (e.g. increasing latency, increased GC or memory usage, increased

Re: Snapinstaller + Overlapping onDeckSearchers Problems

2009-03-25 Thread Ryan McKinley
I don't understand why this sometimes takes two minutes between the start commit /update and sometimes takes 20 minutes? One of our caches has about ~40,000 items, but I can't imagine it taking 20 minutes to autowarm a searcher. What do your cache configs look like? How big is the

Re: Snapinstaller + Overlapping onDeckSearchers Problems

2009-03-25 Thread Cloude Porteus
Yes, I guess I'm running 40k queries when it starts :) I didn't know that each count was equal to a query. I thought it was just copying the cache entries from the previous searcher, but I guess that wouldn't include new entries. I set it to the size of our filterCache. What should I set the

Strange anomaly(?) with string matching in query

2009-03-25 Thread Kurt Nordstrom
Hello, We've encountered a strange issue in our Solr install regarding a particular string that just doesn't seem to want to return results, despite the exact same string being in the index. What makes it even stranger is that we had the same data in a previous install of Solr, and it worked

Re: speeding up indexing with a LOT of indexed fields

2009-03-25 Thread Britske
Thanks for the quick reply. the box has 8 real cpu's. Perhaps a good idea then to reduce the nr of cores to 8 as well. I'm testing out a different scenario with multiple boxes as well, where clients persist docs to multiple cores on multiple boxes. (which is what multicore was invented for after

Re: Strange anomaly(?) with string matching in query

2009-03-25 Thread Otis Gospodnetic
Hi, Take the whole string to your Solr Admin - Analysis page and analyze it. Does it get analyzed the way you'd expect it to be analyzed? Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message From: Kurt Nordstrom knordst...@library.unt.edu To:

REST interface for Query

2009-03-25 Thread Olson, Curtis B
Greetings, I am a new subscriber. I'm Curtis Olson and I work for CACI under contract at the U.S. Department of State, where we deal with massive quantities of documents, so Solr is ideal for us. We have a good sized index that we are starting to build up in development. Some of the filter

Re: Snapinstaller + Overlapping onDeckSearchers Problems

2009-03-25 Thread Ryan McKinley
It looks like the cache is configured big enough, but the autowarm count is too big to have good performance. Try something smaller and see if that fixes both problems. I imagine even just warming the most recent 100 queries would precache the most important ones, but try some higher

How do I accomplish this (semi-)complicated setup?

2009-03-25 Thread Jesper Nøhr
Hi list, I've finally settled on Solr, seeing as it has almost everything I could want out of the box. My setup is a complicated one. It will serve as the search backend on Bitbucket.org, a mercurial hosting site. We have literally thousands of code repositories, as well as users and other data.

Re: REST interface for Query

2009-03-25 Thread Otis Gospodnetic
Curtis, Like this? https://issues.apache.org/jira/browse/SOLR-839 Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message From: Olson, Curtis B olso...@state.gov To: solr-user@lucene.apache.org Sent: Wednesday, March 25, 2009 12:28:35 PM Subject:

Re: How do I accomplish this (semi-)complicated setup?

2009-03-25 Thread Eric Pugh
You could index the user name or ID, and then in your application add the username as a filter as you pass the query back to Solr. Maybe have an access_type that is Public or Private, and then for public searches only include the ones that meet the access_type of Public. Eric On Mar 25,
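
The suggestion above can be sketched as an application-side helper that always attaches a filter query (`fq`) before the request reaches Solr. Only `access_type` comes from the message. The `owner` field and the function names are hypothetical, and the sketch assumes both fields are indexed as untokenized strings so that the values match exactly.

```python
def _quote(value):
    """Wrap a value in double quotes, escaping backslashes and quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def search_params(user_query, username=None):
    """Return Solr parameters restricted to what the caller may see.

    Anonymous callers only get Public documents; a logged-in user also gets
    documents whose (hypothetical) owner field matches their username.
    """
    params = [("q", user_query)]
    if username is None:
        params.append(("fq", "access_type:Public"))
    else:
        params.append(("fq", f"access_type:Public OR owner:{_quote(username)}"))
    return params


if __name__ == "__main__":
    print(search_params("parser"))
    print(search_params("parser", "jesper"))
```

Keeping the restriction in `fq` rather than in `q` leaves the user's query untouched and lets Solr cache the filter separately.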

Re: How do I accomplish this (semi-)complicated setup?

2009-03-25 Thread Jesper Nøhr
On Wed, Mar 25, 2009 at 5:57 PM, Eric Pugh ep...@opensourceconnections.com wrote: You could index the user name or ID, and then in your application add as filter the username as you pass the query back to Solr.  Maybe have a access_type that is Public or Private, and then for public searches

Re: How do I accomplish this (semi-)complicated setup?

2009-03-25 Thread Alejandro Gonzalez
You can even create separate indexes for private or public access if you need (and place them on separate machines), but I think Eric's suggestion is the best and easiest On Wed, Mar 25, 2009 at 5:52 PM, Jesper Nøhr jno...@gmail.com wrote: Hi list, I've finally settled on Solr, seeing as it

Re: Strange anomaly(?) with string matching in query

2009-03-25 Thread Kurt Nordstrom
Otis: Okay, I'm not sure whether I should be including the quotes in the query when using the analyzer, so I've run it both ways (no quotes on the index value). I'll try to approximate the final tables returned for each term: The field is dc_subject in both cases, being of type text ***

Re: How do I accomplish this (semi-)complicated setup?

2009-03-25 Thread Alejandro Gonzalez
i can't see the problem about that. you can manage your users using a DB and keep there the permissions they could have, and create or erase users without problems. you just have to manage a working index field for each user with repositories' ids he can access. or u can create several indexes and

RE: REST interface for Query

2009-03-25 Thread Olson, Curtis B
Otis, that very much looks like what I'm after. Curtis -Original Message- From: Otis Gospodnetic [mailto:otis_gospodne...@yahoo.com] Sent: Wednesday, March 25, 2009 12:53 PM To: solr-user@lucene.apache.org Subject: Re: REST interface for Query Curtis, Like this?

getting started

2009-03-25 Thread nga pham
Hi, Some of the getting started links don't work. Can you please enable them?

Re: How do I accomplish this (semi-)complicated setup?

2009-03-25 Thread Jesper Nøhr
Hm, I must be missing something, then. Consider this. There are three repositories, A and B, C. There are two users, U1 and U2. Repository A is public, while B and C are private. Only U1 can access B. No one can access C. I index this data, such that Is_Private is true for B. Now, when U2

Re: How do I accomplish this (semi-)complicated setup?

2009-03-25 Thread Alejandro Gonzalez
OK, so you can create a table in a DB where you have a row for each user and a field with the reps he/she can access. Then you just have to take a look at the DB and include the repository name in the index. So you just have to control (using query parameters) if the query is done for the right reps

Re: Strange anomaly(?) with string matching in query

2009-03-25 Thread Kurt Nordstrom
Otis, Absolutely. Here are the tokenizers and filters for the text fieldtype in the schema. http://pastebin.com/f2bb249f3 Thanks! That's what I suspected. Want to paste the relevant tokenizer+filters sections of your schema? The index-time and query-time analysis has to be the same or

Re: getting started

2009-03-25 Thread Erick Erickson
Which links? Please be as specific as possible. Erick On Wed, Mar 25, 2009 at 1:20 PM, nga pham nga.p...@gmail.com wrote: Hi, Some of the getting started links don't work. Can you please enable them?

Re: getting started

2009-03-25 Thread nga pham
Oops my mistake. Sorry for the trouble On Wed, Mar 25, 2009 at 10:42 AM, Erick Erickson erickerick...@gmail.com wrote: Which links? Please be as specific as possible. Erick On Wed, Mar 25, 2009 at 1:20 PM, nga pham nga.p...@gmail.com wrote: Hi, Some of the getting started links don't

Can TermIndexInterval be set in Solr?

2009-03-25 Thread Burton-West, Tom
Hello all, We are experimenting with the ShingleFilter with a very large document set (1 million full-text books). Because the ShingleFilter indexes every word pair as a token, the number of unique terms increases tremendously. In our experiments so far the tii and tis files are getting very

Re: getting started

2009-03-25 Thread nga pham
http://lucene.apache.org/solr/tutorial.html#Getting+Started link - lucene QueryParser syntax is not working On Wed, Mar 25, 2009 at 10:48 AM, nga pham nga.p...@gmail.com wrote: Oops my mistake. Sorry for the trouble On Wed, Mar 25, 2009 at 10:42 AM, Erick Erickson

Re: Realtime Searching..

2009-03-25 Thread John Wang
Hi Jon: We are running various LinkedIn search systems on Zoie in production. -John On Thu, Feb 19, 2009 at 9:11 AM, Jon Baer jonb...@gmail.com wrote: This part: The part of Zoie that enables real-time searchability is the fact that ZoieSystem contains three IndexDataLoader objects:

Re: getting started

2009-03-25 Thread Erick Erickson
OK, now I'll turn it over to the folks who actually maintain that site G. Meanwhile, here's the link to the 2.4.1 query syntax. http://lucene.apache.org/java/2_4_1/queryparsersyntax.html Best Erick On Wed, Mar 25, 2009 at 2:00 PM, nga pham nga.p...@gmail.com wrote:

Solr OpenBitSet OutofMemory Error

2009-03-25 Thread smock
Hello, After running a nightly release from around January of Solr for about 4 weeks without any problems, I'm starting to see OutofMemory errors: Mar 24, 2009 1:35:36 AM org.apache.solr.common.SolrException log SEVERE: java.lang.OutOfMemoryError: Java heap space at

Re: How do I accomplish this (semi-)complicated setup?

2009-03-25 Thread Jesper Nøhr
OK, we're getting closer. I just have two final questions regarding this then: 1. This would also include all the public repositories, right? If so, how would such a query look? Some kind of is_public:true AND ...? 2. When a repository is made public, the is_public property in the Solr index

Re: Can TermIndexInterval be set in Solr?

2009-03-25 Thread Otis Gospodnetic
I think it's the latter. I don't think the term interval is exposed anywhere. If you expose it through the config and provide a patch, I think we can add this to the core quickly. Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message From:

Re: Realtime Searching..

2009-03-25 Thread Otis Gospodnetic
Would it not make more sense to wait for the Lucene's IW+IR marriage and other things happening in core Lucene that will make near-real-time search possible? Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message From: John Wang john.w...@gmail.com

SRW/U and OAI-PMH servers over solr

2009-03-25 Thread Miguel Coxo
Hello there, I'm looking for a way to implement SRW/U and OAI-PMH servers over Solr, similar to what I have found here: http://marc.info/?l=solr-dev&m=116405019011211&w=2 . Well, actually, if it is decoupled (not a plugin) that would be OK, if not better =). I wanted to know if anyone knows if there is

Partition index by time using Solr

2009-03-25 Thread vivek sar
Hi, I've used Lucene before, but new to Solr. I've gone through the mailing list, but unable to find any clear idea on how to partition Solr indexes. Here is what we want, 1) Be able to partition indexes by timestamp - basically partition per day (create a new index directory every day)

Re: How do I accomplish this (semi-)complicated setup?

2009-03-25 Thread Alejandro Gonzalez
Try using a DB for permission management, and when you want to make a rep public you just have to add its id or name to every user's permissions field. I think you don't need to add any is_public field to the index, just an id or name field for the rep in which the indexed doc is. So you can pre-filter the reps by querying the

Re: Delta import

2009-03-25 Thread AlexxelA
Yes my database is remote, mysql 5 and i'm using connector/J 5.1.7. My index has 2 documents. When i try to do lets say 14 updates it takes about 18 sec total. Here's the resulting log of the operation : 2009-03-25 15:53:57 org.apache.solr.handler.dataimport.JdbcDataSource$1 call INFO:

Re: Snapinstaller + Overlapping onDeckSearchers Problems

2009-03-25 Thread Cloude Porteus
I set the autowarm to 2000, which only takes about two minutes and resolves my issues. Thanks for your help! best, cloude On Wed, Mar 25, 2009 at 9:34 AM, Ryan McKinley ryan...@gmail.com wrote: It looks like the cache is configured big enough, but the autowarm count is too big to have good

Re: SRW/U and OAI-PMH servers over solr

2009-03-25 Thread Ryan McKinley
I implemented OAI-PMH for solr a few years back for the Massachusetts library system... it appears not to be running right now, but check... http://www.digitalcommonwealth.org/ It would be great to get that code revived and live open source somewhere. As is, it uses a pre 1.3 release

large index vs multicore

2009-03-25 Thread Manepalli, Kalyan
Hi All, In my project, I have one primary core containing all the basic information for a product. Now I need to add additional information which will be searched and displayed in conjunction with the product results. My question is - From a design and query speed point of view - should I

solr_hostname in scripts.conf

2009-03-25 Thread Garafola Timothy
I've a question. Is it safe to use 'localhost' as solr_hostname in scripts.conf? -- -Tim

Re: get all facets

2009-03-25 Thread Ashish P
Actually what I meant was if there are 100 indexed fields. So there are 100 facet fields right.. So whenever I create solrQuery, I have to do addFacetField(fieldName) can I avoid this and just get all facet fields. Sorry for the confusion. Thanks again, Ashish Shalin Shekhar Mangar wrote:

Re: large index vs multicore

2009-03-25 Thread Ryan McKinley
My question is - From a design and query speed point of view - should I add a new core to handle the additional data or should I add the data to the existing core. Do you ever need to get results from both sets of data in the same query? If so, putting them in the same index will be faster.

Re: large index vs multicore

2009-03-25 Thread Otis Gospodnetic
Hi, Without knowing the details, I'd say keep it in the same index if the additional information shares some/enough fields with the main product data and separately if it's sufficiently distinct (this also means 2 queries and manual merging/joining). Otis -- Sematext -- http://sematext.com/

Re: Solr OpenBitSet OutofMemory Error

2009-03-25 Thread Otis Gospodnetic
Hi, I'm not sure if anyone will be able to help without more detail. First suggestion would be to look at Solr with a debugger/profiler to see where memory is used up. Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message From: smock

Re: Delta import

2009-03-25 Thread Noble Paul നോബിള്‍ नोब्ळ्
Hi Alex , you may be able to use CachedSqlEntityprocessor. you can do delta-import using full-import http://wiki.apache.org/solr/DataImportHandlerFaq#fullimportdelta the inner entity can use a CachedSqlEntityProcessor On Thu, Mar 26, 2009 at 1:45 AM, AlexxelA alexandre.boudrea...@canoe.ca

Re: Not able to configure multicore

2009-03-25 Thread mitulpatel
Actually solr2 is an application other than the default one (example) on which I have configured my application. Let me explain things in more detail: so my application path is http://localhost:8983/solr2/admin and I would like to configure it for multi-cores so I have placed solr.xml in config

Scheduling DIH

2009-03-25 Thread Tricia Williams
Hello, Is there a best way to schedule the DataImportHandler? The idea being to schedule a delta-import every Sunday morning at 7am or perhaps every hour without human intervention. Writing a cron job to do this wouldn't be difficult. I'm just wondering is this a built in feature?

Re: Scheduling DIH

2009-03-25 Thread Noble Paul നോബിള്‍ नोब्ळ्
Right now a cron job is the only option. Building this into DIH has been a common request. What do others think about this? On Thu, Mar 26, 2009 at 10:11 AM, Tricia Williams williams.tri...@gmail.com wrote: Hello, Is there a best way to schedule the DataImportHandler? The idea being to
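
Since a cron job is the suggested route, here is a minimal sketch of a script that cron could run to request a delta-import over HTTP. The base URL, the `/dataimport` handler path (it is whatever name the handler was given in solrconfig.xml) and the function names are assumptions for illustration.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def delta_import_url(solr_base, handler="/dataimport"):
    """Build the URL that asks the DataImportHandler for a delta-import."""
    query = urlencode({"command": "delta-import"})
    return f"{solr_base.rstrip('/')}{handler}?{query}"


def trigger_delta_import(solr_base, timeout=30):
    """Send the request and return the HTTP status code."""
    with urlopen(delta_import_url(solr_base), timeout=timeout) as response:
        return response.status


if __name__ == "__main__":
    # Example crontab entry (07:00 every Sunday), with a placeholder path:
    #   0 7 * * 0 python3 /path/to/trigger_delta_import.py
    print(trigger_delta_import("http://localhost:8983/solr"))
```

The request only starts the import. Progress would have to be checked separately, for example by calling the same handler with `command=status`.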