Re: Index structuring

2008-06-04 Thread Noble Paul നോബിള്‍ नोब्ळ्
For the data size you are proposing, a single index should be fine. Just give the machine enough RAM. Distributed search involves multiple requests made between shards, which may be an unnecessary overhead. --Noble On Wed, Jun 4, 2008 at 4:02 PM, Ritesh Ambastha [EMAIL PROTECTED] wrote: Thanks Noble,

Re: Index structuring

2008-06-04 Thread Ritesh Ambastha
Thanks Noble, That means I can go ahead with a single index for a long time. :) Regards, Ritesh Ambastha Noble Paul നോബിള്‍ नोब्ळ् wrote: For the data size you are proposing, a single index should be fine. Just give the machine enough RAM. Distributed search involves multiple requests made between

Re: Index structuring

2008-06-04 Thread Ritesh Ambastha
Thanks Noble, I maintain two separate indexes on my disk for two different search services. The sizes of the two indexes are 91MB and 615MB. I am pretty sure that these indexes will grow in future and may reach 10GB. My questions: 1. When should I start partitioning my index? 2. Is there any

Re: Index structuring

2008-06-04 Thread Shalin Shekhar Mangar
A lot of this also depends on the number of documents. But we have successfully used Solr with up to 10-12 million documents. On Wed, Jun 4, 2008 at 4:10 PM, Ritesh Ambastha [EMAIL PROTECTED] wrote: Thanks Noble, That means I can go ahead with a single index for a long time. :) Regards, Ritesh

Re: Index structuring

2008-06-04 Thread Ritesh Ambastha
The number of docs I have indexed so far is 1,633,570. I am a bit afraid, as the number of indexed docs will grow at least 5-10 times in the very near future. Regards, Ritesh Ambastha Shalin Shekhar Mangar wrote: A lot of this also depends on the number of documents. But we have successfully

Re: Index structuring

2008-06-04 Thread Noble Paul നോബിള്‍ नोब्ळ्
For 16 mil docs it may not be necessary. Add the shards when you see that perf is degrading. --Noble On Wed, Jun 4, 2008 at 4:17 PM, Ritesh Ambastha [EMAIL PROTECTED] wrote: The number of docs I have indexed so far is 1,633,570. I am a bit afraid, as the number of indexed docs will grow
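For when that point comes, here is a minimal sketch in Java of what a sharded query looks like from the client side. It assumes Solr 1.3-style distributed search, where the request carries a `shards` parameter listing each shard as host:port/path. The class name, host names and ports are hypothetical and only illustrate the shape of the URL.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.List;

// Hypothetical helper: builds the URL for a distributed (sharded) Solr query.
final class ShardedQueryUrl {

    // baseUrl: the Solr instance that receives the request and fans it out.
    // shards:  "host:port/path" entries, one per shard (assumed addresses).
    static String build(String baseUrl, String query, List<String> shards) {
        String q = URLEncoder.encode(query, StandardCharsets.UTF_8);
        String shardList = URLEncoder.encode(String.join(",", shards), StandardCharsets.UTF_8);
        return baseUrl + "/select?q=" + q + "&shards=" + shardList;
    }

    public static void main(String[] args) {
        // Two assumed shards on one machine, purely for illustration.
        System.out.println(build(
                "http://localhost:8983/solr",
                "title:solr",
                List.of("localhost:8983/solr", "localhost:7574/solr")));
    }
}
```

Each entry in the shard list is another request the receiving instance has to issue and merge, which is the overhead mentioned earlier in this thread.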

Re: Solrj + Multicore

2008-06-04 Thread Alexander Ramos Jardim
2008/6/3 Ryan McKinley [EMAIL PROTECTED]: This way I don't connect: new CommonsHttpSolrServer("http://localhost:8983/solr/idxItem") this is how you need to connect... otherwise nothing will work. When I try this way, I get the following exception, when trying to make an update to my

1.3 DisMax and MoreLikeThis

2008-06-04 Thread Tom Morton
Hi, I wanted to use the new dismax support for more like this described in SOLR-295 https://issues.apache.org/jira/browse/SOLR-295 but can't even get the new syntax for dismax to work (described in SOLR-281 https://issues.apache.org/jira/browse/SOLR-281). Any ideas if this functionality works?

Re: 1.3 DisMax and MoreLikeThis

2008-06-04 Thread Yonik Seeley
On Wed, Jun 4, 2008 at 11:11 AM, Tom Morton [EMAIL PROTECTED] wrote: I wanted to use the new dismax support for more like this described in SOLR-295 https://issues.apache.org/jira/browse/SOLR-295 but can't even get the new syntax for dismax to work (described in

an error after deleting index files

2008-06-04 Thread Nahuel ANGELINETTI
Hi, I'm doing some tests with Solr to see if it can be useful for us, and after deleting the index to restart the indexing from scratch, it returns this error when I try to post data: java.lang.RuntimeException: java.io.FileNotFoundException: no segments* file found in

Re: Ideas on how to implement sponsored results

2008-06-04 Thread Alexander Ramos Jardim
Cuong, I think you will need some manipulation beyond Solr queries. You should separate the results by your site criteria after retrieving them. After that, you could cache the results in your application and randomize the lists every time you render a page. I don't know if Solr has
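A minimal sketch of that approach in Java, assuming the application has already fetched the results from Solr and cached them, and can tell sponsored entries from regular ones. The `Result` type, its `sponsored` flag, the class name, and the choice to list sponsored entries first are all hypothetical; Solr itself is not involved in this step.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

final class SponsoredResults {

    // Hypothetical result type: an id plus a flag set by the site's own criteria.
    static final class Result {
        final String id;
        final boolean sponsored;

        Result(String id, boolean sponsored) {
            this.id = id;
            this.sponsored = sponsored;
        }
    }

    // Splits cached results into sponsored and regular, shuffles the sponsored
    // ones, and returns sponsored first followed by regular in their original order.
    static List<Result> arrange(List<Result> cached, Random rnd) {
        List<Result> sponsored = new ArrayList<>();
        List<Result> regular = new ArrayList<>();
        for (Result r : cached) {
            (r.sponsored ? sponsored : regular).add(r);
        }
        Collections.shuffle(sponsored, rnd); // re-randomized on every render
        List<Result> page = new ArrayList<>(sponsored);
        page.addAll(regular);
        return page;
    }

    public static void main(String[] args) {
        List<Result> cached = List.of(
                new Result("doc-1", false),
                new Result("ad-1", true),
                new Result("doc-2", false),
                new Result("ad-2", true));
        for (Result r : arrange(cached, new Random())) {
            System.out.println(r.id);
        }
    }
}
```

Calling `arrange` on the cached list at render time leaves the cache untouched, so each page view gets a fresh ordering of the sponsored entries without another query.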

RE: How to describe 2 entities in dataConfig for the DataImporter?

2008-06-04 Thread Julio Castillo
Noble, Thanks for continuing to assist me in trying to come up with a config that works. A couple of questions/clarifications: 1) I had to introduce the artificial comboID and the transformer because of a conflict with a parallel entity on the id (vets and owners). 2) I don't think there is a conflict

Re: 1.3 DisMax and MoreLikeThis

2008-06-04 Thread Tom Morton
Hi, Thanks Yonik. That fixed that. It would be useful to change one of the existing dismax query types in the default solrconfig.xml to use this new syntax (especially since DisMaxRequestHandler is being deprecated). Thanks again...Tom On Wed, Jun 4, 2008 at 11:19 AM, Yonik Seeley [EMAIL

Re: Solrj + Multicore

2008-06-04 Thread Erik Hatcher
On Jun 4, 2008, at 10:07 AM, Alexander Ramos Jardim wrote: 2008/6/3 Ryan McKinley [EMAIL PROTECTED]: This way I don't connect: new CommonsHttpSolrServer("http://localhost:8983/solr/idxItem") this is how you need to connect... otherwise nothing will work. When I try this way, I get the

Boost support for MoreLikeThis fields

2008-06-04 Thread Tom Morton
Hi, SOLR-295 https://issues.apache.org/jira/browse/SOLR-295 mentions boost support for morelikethis and then seems to have been subsumed by SOLR-281 https://issues.apache.org/jira/browse/SOLR-281. To be clear, I'm talking about boosts for the mlt.fl fields and how they are ranked rather than for

Re: How to describe 2 entities in dataConfig for the DataImporter?

2008-06-04 Thread Shalin Shekhar Mangar
Hi Julio, The following are my assumptions after studying the data-config examples you gave: 1. The column id is present in all three tables -- vets, owners and pets. 2. Vets and owners are independent of each other; there is no join required between them. 3. There is a parent-child relationship

Re: Solrj + Multicore

2008-06-04 Thread Alexander Ramos Jardim
It is mapped correctly. 2008/6/4 Erik Hatcher [EMAIL PROTECTED]: On Jun 4, 2008, at 10:07 AM, Alexander Ramos Jardim wrote: 2008/6/3 Ryan McKinley [EMAIL PROTECTED]: This way I don't connect: new CommonsHttpSolrServer("http://localhost:8983/solr/idxItem") this is how you need to

RE: How to describe 2 entities in dataConfig for the DataImporter?

2008-06-04 Thread Julio Castillo
Thanks Shalin, I'll try this asap. Yes, you did understand the sample schema I've been playing with. Just a couple of questions to clarify my own understanding of your proposal. 1) The column comboId doesn't exist in the DB (yet it is specified as a separate column for both owners and vets in

Re: How to describe 2 entities in dataConfig for the DataImporter?

2008-06-04 Thread Shalin Shekhar Mangar
1. Yes, you can add virtual columns. Any column/name which does not exist in schema.xml is used for joins but not added to the document. The column attribute is the key with which data is read from the entity's Map. The name attribute is the Solr field to which data is written. Also note that

Re: How to describe 2 entities in dataConfig for the DataImporter?

2008-06-04 Thread Shalin Shekhar Mangar
Another thing to note: the parentDeltaQuery is of no use if deltaQuery is not specified. In my experience, DataImportHandler is pretty fast and delta queries may not be needed at all if your dataset is small. We use it without delta queries even with millions of Solr documents and it completes

Luke / Lucli w/ Solr index (trunk)

2008-06-04 Thread Jon Baer
Hi, Just recently upgraded w/ trunk version, Solr works fine but Luke and Lucli are showing this: lucli index /www/solr/test/index Lucene CLI. Using directory '/www/solr/test/index'. Type 'help' for instructions. Error:org.apache.lucene.index.CorruptIndexException: Unknown format version:

Re: Luke / Lucli w/ Solr index (trunk)

2008-06-04 Thread Alexander Ramos Jardim
I got this error when trying to use an index generated by an old version with trunk Solr. 2008/6/4 Jon Baer [EMAIL PROTECTED]: Hi, Just recently upgraded w/ trunk version, Solr works fine but Luke and Lucli are showing this: lucli index /www/solr/test/index Lucene CLI. Using directory

Re: Luke / Lucli w/ Solr index (trunk)

2008-06-04 Thread Otis Gospodnetic
Jon, Lucli is ancient and, as far as I recall, has its own Lucene jar, which you could try replacing. Luke might be better at this point, but even with Luke you may have to use your own Lucene jar (the one you used with Solr+DIH). The error indicates index format incompatibility (likely old

Re: Solrj + Multicore

2008-06-04 Thread Ryan McKinley
Are you using a recent version of multi-core? Do you have the /update RequestHandler mapped in solrconfig.xml? Since multicore support is new, it does not support the @deprecated /update servlet. ryan On Jun 4, 2008, at 10:07 AM, Alexander Ramos Jardim wrote: 2008/6/3 Ryan McKinley

Re: Luke / Lucli w/ Solr index (trunk)

2008-06-04 Thread Grant Ingersoll
You will more than likely need to replace the Lucene jars in Luke with the ones used in Solr. We have upgraded Solr's Lucene dependencies in the trunk. I think, this means getting just the Luke jar (instead of Luke all) and then using it with the Lucene lib here. Double check the Luke

Re: Luke / Lucli w/ Solr index (trunk)

2008-06-04 Thread Jon Baer
Thanks ... that worked. Much appreciated. - Jon On Jun 4, 2008, at 4:51 PM, Grant Ingersoll wrote: You will more than likely need to replace the Lucene jars in Luke with the ones used in Solr. We have upgraded Solr's Lucene dependencies in the trunk. I think, this means getting just

POSTing repeated fields to Solr

2008-06-04 Thread Andrew Nagy
Hello - I was wondering if there is a workaround for POSTing repeated fields to Solr. I am using Jetty as my container with Solr 1.2. I tried something like:

HttpDataSource common fields

2008-06-04 Thread Jon Baer
Hi, I have a question about the HttpDataSource (DataImportHandler) ... Is it possible to add common values *explicitly*, something like: <field column="column" value="value" commonField="true" /> I'm blanking on whether XPath has a command / option to return just a string literal (vs. a node). Thanks.

Re: what is null value behavior in function queries?

2008-06-04 Thread Chris Hostetter
: I am using function queries to rank the results, : if some/all of the fields (used in the function) are missing from the document : what will be the ranking behavior for such documents? Off the top of my head, I believe it's zero, but an easy way to check is to run a simple linear function

Re: new user: some questions about parameters and query syntax

2008-06-04 Thread Chris Hostetter
: I dunno... for something like fl, it still seems a bit verbose to : list every field separately. : Some of these things feel like trade offs in ease of readability : manually typing of URLs vs ease of programmatic manipulation. I don't even think the primary issue is programmatic

Re: ClassCastException trying to use distributed search

2008-06-04 Thread Chris Hostetter
: Hoss: You are right. It has a version byte written first. This can be : used for any changes that come later. So, when we introduce any Cool .. I just wanted to make sure the issue here was the default format changing from XML to binary as part of the development cycle and not an expected

Re: POSTing repeated fields to Solr

2008-06-04 Thread Mike Klaas
On 4-Jun-08, at 2:22 PM, Andrew Nagy wrote: Hello - I was wondering if there is a workaround for POSTing repeated fields to Solr. I am using Jetty as my container with Solr 1.2. I tried something like:

Re: HttpDataSource common fields

2008-06-04 Thread Noble Paul നോബിള്‍ नोब्ळ्
commonField="true" can be added to any field when you are using an XPathEntityProcessor. But you will never need to do so because only XML has such a requirement. If you wish to add a string literal, use a TemplateTransformer and keep the field as <field column="column" value="value" template="my-string-literal"/> The

Re: HttpDataSource common fields

2008-06-04 Thread Noble Paul നോബിള്‍ नोब्ळ्
The attachment did not work; try this: http://www.nabble.com/Re%3A-How-to-describe-2-entities-in-dataConfig-for-the-DataImporter--p17577610.html --Noble On Thu, Jun 5, 2008 at 9:37 AM, Noble Paul നോബിള്‍ नोब्ळ् [EMAIL PROTECTED] wrote: commonField="true" can be added to any field when you are using an

Multiple Schema File

2008-06-04 Thread Sachit P. Menon
Hi folks, I have a scenario as follows: I have a CMS in which I'm storing all the content. I need to index all this content and search on these indexes. For indexing, I can define a schema for all the content. Some of the properties are like title, headline, body, keywords,

Re: Multiple Schema File

2008-06-04 Thread Noble Paul നോബിള്‍ नोब्ळ्
Hi, use multi-core Solr. Each core can have its own schema. If possible the DISCLAIMER can be dropped. --Noble On Thu, Jun 5, 2008 at 11:13 AM, Sachit P. Menon [EMAIL PROTECTED] wrote: Hi folks, I have a scenario as follows: I have a CMS in which I'm storing all the content. I need to

Re: Multiple Schema File

2008-06-04 Thread climbingrose
Hi Sachit, I think what you could do is create all the core fields of your models, such as username, role, title, body, images... You can name them with a prefix, like user.username, user.role, article.title, article.body... If you want to dynamically add more fields to your schema, you can use