[jira] Commented: (SOLR-1630) StringIndexOutOfBoundsException in SpellCheckComponent

2010-01-15 Thread Ralf Kraus (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-1630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12800622#action_12800622 ] Ralf Kraus commented on SOLR-1630: -- We have found a hint to the problem: We run into

Re: SolrCloud logical shards

2010-01-15 Thread Uri Boness
Can you elaborate on what you mean, isn't a core a single index too? It seems like shard was used to represent a remote index (perhaps?). Yes, a core is a single index and a shard is a conceptual idea which at the moment concretely refers to a remote core (but not a specific one as the same

[jira] Assigned: (SOLR-1721) Add explicit option to run DataImportHandler in synchronous mode

2010-01-15 Thread Noble Paul (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-1721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Noble Paul reassigned SOLR-1721: Assignee: Noble Paul Add explicit option to run DataImportHandler in synchronous mode

[jira] Resolved: (SOLR-1696) Deprecate old highlighting syntax and move configuration to HighlightComponent

2010-01-15 Thread Noble Paul (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-1696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Noble Paul resolved SOLR-1696. -- Resolution: Fixed committed r899572 Deprecate old highlighting syntax and move configuration to

[jira] Resolved: (SOLR-1721) Add explicit option to run DataImportHandler in synchronous mode

2010-01-15 Thread Noble Paul (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-1721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Noble Paul resolved SOLR-1721. -- Resolution: Fixed committed r899580 Thanks Alexy Add explicit option to run DataImportHandler in

[jira] Created: (SOLR-1723) VelocityResponseWriter view enhancement ideas

2010-01-15 Thread Erik Hatcher (JIRA)
VelocityResponseWriter view enhancement ideas - Key: SOLR-1723 URL: https://issues.apache.org/jira/browse/SOLR-1723 Project: Solr Issue Type: Improvement Components: Response Writers

[jira] Commented: (SOLR-1723) VelocityResponseWriter view enhancement ideas

2010-01-15 Thread Erik Hatcher (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-1723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12800685#action_12800685 ] Erik Hatcher commented on SOLR-1723: Clean up debug.vm to make a collapsible tree view

[jira] Commented: (SOLR-1301) Solr + Hadoop

2010-01-15 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-1301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12800720#action_12800720 ] Grant Ingersoll commented on SOLR-1301: --- Seems like this would make the most sense as

[jira] Commented: (SOLR-1301) Solr + Hadoop

2010-01-15 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-1301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12800723#action_12800723 ] Grant Ingersoll commented on SOLR-1301: --- bq. Furthermore, by using an

Re: SolrCloud logical shards

2010-01-15 Thread Yonik Seeley
On Thu, Jan 14, 2010 at 1:38 PM, Ted Dunning ted.dunn...@gmail.com wrote: I think that most of these complications go away to a remarkable degree if you combine katta style random assignment of small shards. The major simplifications there include: - no need to move individual documents, nor

Re: [jira] Created: (SOLR-1301) Solr + Hadoop

2010-01-15 Thread Marc Sturlese
Hey there, I have just started using hadoop to create Lucene/Solr indexes. Have a couple of questions. I have seen there's a hadoop contrib to build a lucene index (org.apache.hadoop.contrib.index). That contrib has a Partitioner to decide for every map output which reducer to go. It uses

[jira] Commented: (SOLR-1301) Solr + Hadoop

2010-01-15 Thread Andrzej Bialecki (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-1301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12800746#action_12800746 ] Andrzej Bialecki commented on SOLR-1301: - bq. I'm curious about the not sending

[jira] Commented: (SOLR-1301) Solr + Hadoop

2010-01-15 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-1301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12800748#action_12800748 ] Grant Ingersoll commented on SOLR-1301: --- bq. Hmm, I don't think this would make sense

Re: SolrCloud logical shards

2010-01-15 Thread Jason Rutherglen
The point I was trying to make is that I believe that if you start changing terminologies now people will be very confused So shard - remote core... Slice - core group. Though semantically they're synonyms. In any case, I need to spend some time looking at the cloud branch, and less time

[jira] Commented: (SOLR-1301) Solr + Hadoop

2010-01-15 Thread Jason Rutherglen (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-1301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12800756#action_12800756 ] Jason Rutherglen commented on SOLR-1301: Andrzej's model works great in production.

[jira] Commented: (SOLR-1301) Solr + Hadoop

2010-01-15 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-1301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12800760#action_12800760 ] Grant Ingersoll commented on SOLR-1301: --- Don't confuse the ZK stuff for search w/ the

[jira] Commented: (SOLR-1301) Solr + Hadoop

2010-01-15 Thread Andrzej Bialecki (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-1301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12800758#action_12800758 ] Andrzej Bialecki commented on SOLR-1301: - Iff we somehow could get a mapping

[jira] Commented: (SOLR-1301) Solr + Hadoop

2010-01-15 Thread Jason Rutherglen (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-1301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12800775#action_12800775 ] Jason Rutherglen commented on SOLR-1301: {quote}What I meant was the Hadoop job

[jira] Commented: (SOLR-1301) Solr + Hadoop

2010-01-15 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-1301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12800785#action_12800785 ] Grant Ingersoll commented on SOLR-1301: --- I don't follow how sending docs to a suite of

[jira] Resolved: (SOLR-577) added support for boosting fields and documents to python solr interface

2010-01-15 Thread Otis Gospodnetic (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Otis Gospodnetic resolved SOLR-577. --- Resolution: Won't Fix Closing per comment. added support for boosting fields and documents

[jira] Resolved: (SOLR-216) Improvements to solr.py

2010-01-15 Thread Otis Gospodnetic (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Otis Gospodnetic resolved SOLR-216. --- Resolution: Won't Fix Closing per comment Improvements to solr.py ---

[jira] Commented: (SOLR-1301) Solr + Hadoop

2010-01-15 Thread Jason Rutherglen (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-1301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12800802#action_12800802 ] Jason Rutherglen commented on SOLR-1301: bq. Hadoop streaming the output of the

[jira] Commented: (SOLR-758) Enhance DisMaxQParserPlugin to support full-Solr syntax and to support alternate escaping strategies.

2010-01-15 Thread Otis Gospodnetic (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12800816#action_12800816 ] Otis Gospodnetic commented on SOLR-758: --- Is this still needed with enhanced dismax now

Re: [jira] Commented: (SOLR-1301) Solr + Hadoop

2010-01-15 Thread Ted Dunning
This can also be a big performance win. Jason Venner reports significant index and cluster start time improvements by indexing to local disk, zipping and then uploading the resulting zip file. Hadoop has significant file open overhead so moving one zip file wins big over many index component

Re: [jira] Commented: (SOLR-1301) Solr + Hadoop

2010-01-15 Thread Andrzej Bialecki
On 2010-01-15 20:13, Ted Dunning wrote: This can also be a big performance win. Jason Venner reports significant index and cluster start time improvements by indexing to local disk, zipping and then uploading the resulting zip file. Hadoop has significant file open overhead so moving one zip

Re: [jira] Commented: (SOLR-1301) Solr + Hadoop

2010-01-15 Thread Jason Rutherglen
Zipping cores/shards is in the latest patch... On Fri, Jan 15, 2010 at 11:22 AM, Andrzej Bialecki a...@getopt.org wrote: On 2010-01-15 20:13, Ted Dunning wrote: This can also be a big performance win.  Jason Venner reports significant index and cluster start time improvements by indexing to

Re: [jira] Commented: (SOLR-1301) Solr + Hadoop

2010-01-15 Thread Grant Ingersoll
I can see why that is a win over the existing, but I still don't get why it wouldn't be faster just to index to a suite of Solr master indexers and save all this file slogging around. But, I guess that is a separate patch all together. On Jan 15, 2010, at 2:35 PM, Jason Rutherglen wrote:

Re: [jira] Commented: (SOLR-1301) Solr + Hadoop

2010-01-15 Thread Ted Dunning
The reason I would expect a major speed win when indexing to local disk and copying later is that you get much more efficient reading of documents with normal hadoop mechanisms. Throwing documents to the various Solr master indexers is bound to be slower than having 20 machines reading at local

Re: [jira] Commented: (SOLR-1301) Solr + Hadoop

2010-01-15 Thread Jason Rutherglen
Copying files ala HDFS is trivial because it's sequential, Lucene merging isn't, so scaling merging over 20 machines vs 4 Solr has clear advantages... That and on-demand expandability, so I can reindex 2 terabytes of data in half a day vs weeks or more with 4 Solr masters has compelling

Solr Cloud wiki and branch notes

2010-01-15 Thread Jason Rutherglen
Here are some rough notes after running the unit tests, reviewing some of the code (though not understanding it), and reviewing the wiki page http://wiki.apache.org/solr/SolrCloud We need a protocol in the URL, otherwise it's inflexible. I'm overwhelmed with all the ?? question areas of the

Re: Solr Cloud wiki and branch notes

2010-01-15 Thread Yonik Seeley
On Fri, Jan 15, 2010 at 4:12 PM, Jason Rutherglen jason.rutherg...@gmail.com wrote: The page is huge, which signals to me maybe we're trying to do too much This is really about doing not-so-much in the very near term, while thinking ahead to the longer term. Revamping distributed search could

Re: [jira] Commented: (SOLR-1301) Solr + Hadoop

2010-01-15 Thread Ted Dunning
We index comparable amounts of data in a few hours. On Fri, Jan 15, 2010 at 1:08 PM, Jason Rutherglen jason.rutherg...@gmail.com wrote: That and on-demand expandability, so I can reindex 2 terabytes of data in half a day vs weeks or more with 4 Solr masters has compelling advantages. --

Re: [jira] Commented: (SOLR-1301) Solr + Hadoop

2010-01-15 Thread Grant Ingersoll
Makes sense. Interesting exercise to think about. On Jan 15, 2010, at 4:08 PM, Jason Rutherglen wrote: Copying files ala HDFS is trivial because it's sequential, Lucene merging isn't, so scaling merging over 20 machines vs 4 Solr has clear advantages... That and on-demand expandability, so I

Re: Solr Cloud wiki and branch notes

2010-01-15 Thread Jason Rutherglen
This is really about doing not-so-much in the very near term, while thinking ahead to the longer term. Let's have a page dedicated to release 1.0 of cloud? I feel uncomfortable editing the existing wiki because I don't know what the plans are for the first release. I need to revisit Katta as my

[jira] Created: (SOLR-1724) Real Basic Core Management with Zookeeper

2010-01-15 Thread Jason Rutherglen (JIRA)
Real Basic Core Management with Zookeeper - Key: SOLR-1724 URL: https://issues.apache.org/jira/browse/SOLR-1724 Project: Solr Issue Type: New Feature Components: multicore Affects

[jira] Commented: (SOLR-1724) Real Basic Core Management with Zookeeper

2010-01-15 Thread Jason Rutherglen (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-1724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12800994#action_12800994 ] Jason Rutherglen commented on SOLR-1724: Additionally, upon successful completion of

Re: Solr Cloud wiki and branch notes

2010-01-15 Thread Andrzej Bialecki
Hi, My 0.02 PLN on the subject ... Terminology --- First the terminology: reading your emails I have a feeling that my head is about to explode. We have to agree on the vocabulary, otherwise we have no hope of reaching any consensus. I propose the following vocabulary that has been

[jira] Commented: (SOLR-1724) Real Basic Core Management with Zookeeper

2010-01-15 Thread Ted Dunning (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-1724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12801051#action_12801051 ] Ted Dunning commented on SOLR-1724: --- Katta had some interesting issues in the design of

Re: Solr Cloud wiki and branch notes

2010-01-15 Thread Ted Dunning
On Fri, Jan 15, 2010 at 4:36 PM, Andrzej Bialecki a...@getopt.org wrote: My 0.02 PLN on the subject ... Polish currency seems pretty strong lately. There are a lot of good ideas for this small sum. Terminology * (global) search index * index shard: * partitioning: * search node: *

Planned release date for 1.5 with SOLR-236 fixed?

2010-01-15 Thread Kelly Taylor
Would anybody happen to know the planned release date for 1.5? And if so, whether or not the final fix for SOLR-236 will be included. -Kelly -- View this message in context: http://old.nabble.com/Planned-release-date-for-1.5-with-SOLR-236-fixed--tp27186780p27186780.html Sent from the Solr -

[jira] Commented: (SOLR-1553) extended dismax query parser

2010-01-15 Thread David Smiley (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-1553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12801112#action_12801112 ] David Smiley commented on SOLR-1553: Yonik (or someone else I guess), would you mind