Re: Is there any other way to load the index beside using "http" connection?

2009-07-06 Thread Marcus Herou
Out of my head... but are you not supposed to activate the stream handler in
SOLR? Think it is documented...

Cheers
//Marcus
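
For reference, the stream handler Marcus is likely referring to is remote streaming, switched on in solrconfig.xml; a minimal sketch, with the upload limit value illustrative:

<requestDispatcher handleSelect="true" >
  <requestParsers enableRemoteStreaming="true" multipartUploadLimitInKB="2048" />
</requestDispatcher>

With enableRemoteStreaming="true", Solr honors the stream.file and stream.url parameters used later in this thread.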


On Mon, Jul 6, 2009 at 8:55 PM, Francis Yakin  wrote:

> Yes, I uploaded the CSV file that I got from the database, then I ran that
> cmd and got the error.
>
> Any suggestions?
>
> Thanks
>
> Francis
>
> -Original Message-
> From: NitinMalik [mailto:malik.ni...@yahoo.com]
> Sent: Monday, July 06, 2009 11:32 AM
> To: solr-user@lucene.apache.org
> Subject: RE: Is there any other way to load the index beside using "http"
> connection?
>
>
> Hi Francis,
>
> I have experienced that the update stream handler (for an xml file in my case)
> worked only for Solr running on the same machine. I also got the same error
> when I tried to update the documents on a remote Solr instance.
>
> Regards
> Nitin
>
>
> Francis Yakin wrote:
> >
> >
> > Ok, I have a CSV file (call it test.csv) from the database.
> >
> > When I tried to upload this file to solr using this cmd, I got
> > "stream.contentType=text/plain: No such file or directory" error
> >
> > curl
> >
> http://localhost:8983/solr/update/csv?stream.file=/opt/apache-1.2.0/example/exampledocs/test.csv&stream.contentType=text/plain;charset=utf-8
> >
> > -bash: stream.contentType=text/plain: No such file or directory
> >  undefined field cat
> >
> > What did I do wrong?
> >
> > Francis
> >
> > -Original Message-
> > From: Norberto Meijome [mailto:numard...@gmail.com]
> > Sent: Monday, July 06, 2009 11:01 AM
> > To: Francis Yakin
> > Cc: solr-user@lucene.apache.org
> > Subject: Re: Is there any other way to load the index beside using "http"
> > connection?
> >
> > On Mon, 6 Jul 2009 09:56:03 -0700
> > Francis Yakin  wrote:
> >
> >>  Norberto,
> >>
> >> Thanks, I think my question is:
> >>
> >> >>why not generate your SQL output directly into your oracle server as a
> >> file
> >>
> >> What type of file is this?
> >>
> >>
> >
> > a file in a format that you can then import into SOLR.
> >
> > _
> > {Beto|Norberto|Numard} Meijome
> >
> > "Gravity cannot be blamed for people falling in love."
> >   Albert Einstein
> >
> > I speak for myself, not my employer. Contents may be hot. Slippery when
> > wet. Reading disclaimers makes you go blind. Writing them is worse. You
> > have been Warned.
> >
> >
>
> --
> View this message in context:
> http://www.nabble.com/Is-there-any-other-way-to-load-the-index-beside-using-%22http%22-connection--tp24297934p24360603.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>
>


-- 
Marcus Herou CTO and co-founder Tailsweep AB
+46702561312
marcus.he...@tailsweep.com
http://www.tailsweep.com/


Tagging and searching on tagged indexes.

2009-07-06 Thread Rakhi Khatwani
Hi,
 How do we tag solr indexes and search on those indexes? There is not
much information on the wiki. All I could find is this:
http://wiki.apache.org/solr/UserTagDesign

has anyone tried it? (using solr API)


One more question, can we change the schema dynamically at runtime? (while
solr instance is on??)

Regards,
Raakhi.


Query on the updation of synonym and stopword file.

2009-07-06 Thread Sagar Khetkade

Hello All,
 
I was figuring out an issue with the synonym.txt and stopword.txt files being
updated at regular intervals.
In my case I am updating the synonym.txt and stopword.txt files as the
synonym and stop word dictionaries are updated. The problem I am facing is that
even after reloading the core and re-indexing the documents, the newly updated
synonyms and stop words are not loaded. It seems the filters are not aware that
these files have been updated, so the only solution for me is to restart the whole
container in which I have embedded the Solr server; that is not feasible in production.
I came across the discussion with subject “synonyms.txt file updated
frequently” in which Grant suggested writing new logic in
SynonymFilterFactory that would take care of this issue. Is there any other
possible solution, or is that the only one?
Thanks in advance!
 
Regards,
Sagar Khetkade
 
 
_
Missed any of the IPL matches ? Catch a recap of all the action on MSN Videos
http://msnvideos.in/iplt20/msnvideoplayer.aspx

Re: Filtering MoreLikeThis results

2009-07-06 Thread Bill Au
I have been trying to restrict MoreLikeThis results without any luck also.
In addition to restricting the results, I am also looking to influence the
scores similar to the way boost query (bq) works in the
DisMaxRequestHandler.

I think Solr's MoreLikeThis depends on Lucene's contrib queries
MoreLikeThis, or at least it used to.  Has anyone looked into enhancing
Solr's MoreLikeThis to support bq and restricting mlt results?

Bill

On Mon, Jul 6, 2009 at 2:16 PM, Yao Ge  wrote:

>
> I could not find any support from http://wiki.apache.org/solr/MoreLikeThis on
> how to restrict MLT results to certain subsets. I passed along a fq
> parameter and it is ignored. Since we can not incorporate the filters in
> the
> query itself which is used to retrieve the target for similarity
> comparison,
> it appears there is no way to filter MLT results. BTW. I am using Solr 1.3.
> Please let me know if there is way (other than hacking the source code) to
> do this. Thanks!
> --
> View this message in context:
> http://www.nabble.com/Filtering-MoreLikeThis-results-tp24360355p24360355.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>
>


Creating DataSource for DIH to Oracle Database

2009-07-06 Thread Francis Yakin

Has anyone had experience creating a datasource for DIH to an Oracle database?

Also, on the Solr side we are running weblogic and deploy the application
using weblogic.
I know in weblogic we can create a datasource that can connect to an Oracle
database; has anyone had experience with this?


Thanks

Francis
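
For readers with the same question: a minimal DIH data-config.xml sketch for an Oracle source (host, SID, credentials, table and column names are all illustrative; the Oracle JDBC driver jar must be on Solr's classpath):

<dataConfig>
  <dataSource type="JdbcDataSource"
              driver="oracle.jdbc.driver.OracleDriver"
              url="jdbc:oracle:thin:@dbhost:1521:ORCL"
              user="scott" password="tiger"/>
  <document>
    <!-- one Solr document per row; columns map onto schema fields -->
    <entity name="item" query="SELECT ID, NAME FROM ITEMS">
      <field column="ID" name="id"/>
      <field column="NAME" name="name"/>
    </entity>
  </document>
</dataConfig>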




Re: Suggestions needed: Lots of updates for tiny changes

2009-07-06 Thread Development Team
Hi Otis,
 Thanks for your reply and for giving it some thought.
 Actually we have considered using something that lives outside of the
main index... We've looked into using the ExternalFileField, but abandoned
that when it became clear we'd have to use a function to use it, and that
limited how we could use the field in our searches.
 For another more-real-time data problem we're having, we've considered
writing a search handler and search component to handle it as a
filter-query. This is equivalent to the "data structure outside of the main
index" that you have proposed. The problem with it is that getting it to be
*part of the index* is difficult.
 Well... any more ideas would be appreciated. But thanks for your help
so far.

- Daryl.
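
For context, the ExternalFileField Daryl mentions is declared in schema.xml roughly as below (field and type names assumed); values come from an external_<fieldname> file of id=value lines kept alongside the index, are re-read when a new searcher opens, and are accessible only through function queries, which is the limitation he describes:

<fieldType name="externalFloat" class="solr.ExternalFileField"
           keyField="id" defVal="0" valType="pfloat"/>
<field name="lastViewedRank" type="externalFloat" indexed="false" stored="false"/>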


On Fri, Jul 3, 2009 at 9:34 PM, Otis Gospodnetic  wrote:

>
> I don't have a very specific suggestion, but I wonder if you could have a
> data structure that lives outside of the main index and keeps only these
> dates.  Presumably this smaller data structure would be simpler/faster to
> update, and you'd just have to remain in sync with the main index
> (document-document mapping).  I think ParallelReader in Lucene is a similar
> approach, as is Solr's ExternalFileField.
>
>  Otis
> --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
>
>
>
> - Original Message 
> > From: Development Team 
> > To: solr-user@lucene.apache.org
> > Sent: Friday, July 3, 2009 4:46:37 PM
> > Subject: Suggestions needed: Lots of updates for tiny changes
> >
> > Hi everybody,
> >  Let's say I had an index with 10M large-ish documents, and as people
> > logged into a website and viewed them the "last viewed date" was updated
> to
> > the current time. We index a document's last-viewed-date because we allow
> > users to a) search on this last-viewed-date alongside all other
> searchable
> > criteria, and b) we can order results of any search by the
> last-viewed-date.
> >  The problem is that in a given 5-minute period, we may have many
> > thousands of updated documents (due to this simple last-viewed-date). We
> > have a task that looks for changed documents, loads the full documents,
> and
> > then feeds them into Solr to update the index, but unfortunately reading
> > these changed documents and continually feeding them to Solr is
> generating *
> > far* more load on our system (both Solr and the database) than any of the
> > searches. In a given day, *we may have more updates to documents than we
> > have total documents indexed*. (Databases don't handle this well either,
> the
> > contention on rows for updates slows the database down significantly.)
> >  How should we approach this problem? It seems like such a waste of
> > resources to be doing so much work in applications/database/solr only for
> > last-viewed-dates.
> >
> >  Solutions we've looked at include:
> >  1) Update only partial document. --Apparently this isn't supported
> in
> > Solr yet (we're using nightly Solr 1.4 builds currently).
> >  2) Use "near-real-time updates". --Not supported yet. Also, the
> > "freshness" of the data isn't as much as concern as the sheer volume of
> > changes that we have to make here. For example, we could update Solr
> > less-frequently, but then we'd just have many more documents to update.
> The
> > data only has to be, say, fresh to within 30 minutes.
> >  3) Use a separate index for the last-viewed-date. --This won't work
> > because we need to search on the last-viewed-date alongside other
> criteria,
> > and we use it as scoring criteria for all our searches.
> >
> >  Any suggestions?
> >
> > Sincerely,
> >
> >  Daryl.
>
>


Re: Problem in parsing non-string dynamic field by using IndexReader

2009-07-06 Thread Yuchen Wang
that works perfectly! Thanks a lot!

On Mon, Jul 6, 2009 at 2:12 PM, Chris Hostetter wrote:

> : OK, here is my latest code to get the IndexReader from the solr core.
> : However, it still printed out the non-string fields as special chars. I
> do
> : use the schema file here. Please help.
>
> you'll want to use the IndexSchema object to get the FieldType
> object for your field name.  then use the FieldType to convert the values
> in the index to readable values.
>
> Take a look at the javadocs for IndexSearcher and FieldType for more
> details.
>
> if you look at code like the XMLResponseWriter you'll see examples of
> iterating over all the fields in a Document and using those methods.
>
>
>
> -Hoss
>
>


Re: Problem in parsing non-string dynamic field by using IndexReader

2009-07-06 Thread Chris Hostetter
: OK, here is my latest code to get the IndexReader from the solr core.
: However, it still printed out the non-string fields as special chars. I do
: use the schema file here. Please help.

you'll want to use the IndexSchema object to get the FieldType 
object for your field name.  then use the FieldType to convert the values 
in the index to readable values.

Take a look at the javadocs for IndexSearcher and FieldType for more 
details.  

if you look at code like the XMLResponseWriter you'll see examples of 
iterating over all the fields in a Document and using those methods.



-Hoss
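
A minimal sketch of what Hoss describes, assuming the Solr 1.x API, where doc is a fetched org.apache.lucene.document.Document and schema is the core's IndexSchema:

import org.apache.lucene.document.Fieldable;
import org.apache.solr.schema.FieldType;

// Render each stored field through its FieldType so that encoded values
// (e.g. sortable numerics) come back in human-readable external form.
for (Object o : doc.getFields()) {
  Fieldable f = (Fieldable) o;
  FieldType ft = schema.getFieldType(f.name());
  System.out.println(f.name() + " = " + ft.toExternal(f));
}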



Re: reindexed data on master not replicated to slave

2009-07-06 Thread solr jay
It looks like the problem is here, or just before this point, in
SnapPuller.fetchLatestIndex():


  terminateAndWaitFsyncService();
  LOG.info("Conf files are not downloaded or are in sync");
  if (isSnapNeeded) {
modifyIndexProps(tmpIndexDir.getName());
  } else {
successfulInstall = copyIndexFiles(tmpIndexDir, indexDir);
  }
  if (successfulInstall) {
logReplicationTimeAndConfFiles(modifiedConfFiles);
doCommit();
  }


I debugged into that place and noticed that isSnapNeeded is true, and therefore

modifyIndexProps(tmpIndexDir.getName());

is executed; but from the function name it looks like installing the index actually
happens in

successfulInstall = copyIndexFiles(tmpIndexDir, indexDir);


The function returns false, but the caller (doSnapPull) never checked the
return value.


Thanks,

J


On Mon, Jul 6, 2009 at 8:02 AM, solr jay  wrote:

> There is only one index directory: index/
>
> Here is the content of index.properties
>
> #index properties
> #Fri Jul 03 14:17:12 PDT 2009
> index=index.20090703021705
>
>
> Thanks,
>
> J
>
> 2009/7/5 Noble Paul നോബിള്‍ नोब्ळ् 
>
> BTW , how many index dirs are there in the data dir ? what is there in
>> the /index.properties ?
>>
>> On Sat, Jul 4, 2009 at 12:15 AM, solr jay wrote:
>> >
>> >
>> > I tried it with the latest nightly build and got the same result.
>> >
>> > Actually that was the symptom and it made me look at the index
>> directory.
>> > The same log messages repeated again and again, never ending.
>> >
>> >
>> >
>> > 2009/7/2 Noble Paul നോബിള്‍ नोब्ळ् 
>> >>
>> >> Jay, I see "Updating index properties..." twice
>> >>
>> >>
>> >>
>> >> This should happen rarely. In your case it should have happened only
>> >> once, because you cleaned up the master only once.
>> >>
>> >>
>> >> On Fri, Jul 3, 2009 at 6:09 AM, Otis
>> >> Gospodnetic wrote:
>> >> >
>> >> > Jay,
>> >> >
>> >> > You didn't mention which version of Solr you are using.  It looks
>> like
>> >> > some trunk or nightly version.  Maybe you can try the latest nightly?
>> >> >
>> >> >  Otis
>> >> > --
>> >> > Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
>> >> >
>> >> >
>> >> >
>> >> > - Original Message 
>> >> >> From: solr jay 
>> >> >> To: solr-user@lucene.apache.org
>> >> >> Sent: Thursday, July 2, 2009 9:14:48 PM
>> >> >> Subject: reindexed data on master not replicated to slave
>> >> >>
>> >> >> Hi,
>> >> >>
>> >> >> When index data were corrupted on master instance, I wanted to wipe
>> out
>> >> >> all
>> >> >> the index data and re-index everything. I was hoping the newly
>> created
>> >> >> index
>> >> >> data would be replicated to slaves, but it wasn't.
>> >> >>
>> >> >> Here are the steps I performed:
>> >> >>
>> >> >> 1. stop master
>> >> >> 2. delete the directory 'index'
>> >> >> 3. start master
>> >> >> 4. disable replication on master
>> >> >> 5. index all data from scratch
>> >> >> 6. enable replication on master
>> >> >>
>> >> >> It seemed from the log file that the slave instances discovered that a
>> >> >> new index was available and claimed that the new index was installed,
>> >> >> and then tried to update index properties; but looking into the index
>> >> >> directory on the slaves, you will find that no index data files were
>> >> >> updated or added, plus the slaves keep trying to get the new index.
>> >> >> Here are some lines from the slave's log file:
>> >> >>
>> >> >> Jul 1, 2009 3:59:33 PM org.apache.solr.handler.SnapPuller
>> >> >> fetchLatestIndex
>> >> >> INFO: Starting replication process
>> >> >> Jul 1, 2009 3:59:33 PM org.apache.solr.handler.SnapPuller
>> >> >> fetchLatestIndex
>> >> >> INFO: Number of files in latest snapshot in master: 69
>> >> >> Jul 1, 2009 3:59:33 PM org.apache.solr.handler.SnapPuller
>> >> >> fetchLatestIndex
>> >> >> INFO: Total time taken for download : 0 secs
>> >> >> Jul 1, 2009 3:59:33 PM org.apache.solr.handler.SnapPuller
>> >> >> fetchLatestIndex
>> >> >> INFO: Conf files are not downloaded or are in sync
>> >> >> Jul 1, 2009 3:59:33 PM org.apache.solr.handler.SnapPuller
>> >> >> modifyIndexProps
>> >> >> INFO: New index installed. Updating index properties...
>> >> >> Jul 1, 2009 4:00:33 PM org.apache.solr.handler.SnapPuller
>> >> >> fetchLatestIndex
>> >> >> INFO: Master's version: 1246488421310, generation: 9
>> >> >> Jul 1, 2009 4:00:33 PM org.apache.solr.handler.SnapPuller
>> >> >> fetchLatestIndex
>> >> >> INFO: Slave's version: 1246385166228, generation: 56
>> >> >> Jul 1, 2009 4:00:33 PM org.apache.solr.handler.SnapPuller
>> >> >> fetchLatestIndex
>> >> >> INFO: Starting replication process
>> >> >> Jul 1, 2009 4:00:33 PM org.apache.solr.handler.SnapPuller
>> >> >> fetchLatestIndex
>> >> >> INFO: Number of files in latest snapshot in master: 69
>> >> >> Jul 1, 2009 4:00:33 PM org.apache.solr.handler.SnapPuller
>> >> >> fetchLatestIndex
>> >> >> INFO: Total time taken for download : 0 secs
>> >> >> Jul 1, 2009 4:00:33 PM org.apache.sol

RE: Is there any other way to load the index beside using "http" connection?

2009-07-06 Thread Francis Yakin
Yes, I uploaded the CSV file that I got from the database, then I ran that cmd
and got the error.

Any suggestions?

Thanks

Francis

-Original Message-
From: NitinMalik [mailto:malik.ni...@yahoo.com]
Sent: Monday, July 06, 2009 11:32 AM
To: solr-user@lucene.apache.org
Subject: RE: Is there any other way to load the index beside using "http" 
connection?


Hi Francis,

I have experienced that the update stream handler (for an xml file in my case)
worked only for Solr running on the same machine. I also got the same error when
I tried to update the documents on a remote Solr instance.

Regards
Nitin


Francis Yakin wrote:
>
>
> Ok, I have a CSV file (call it test.csv) from the database.
>
> When I tried to upload this file to solr using this cmd, I got
> "stream.contentType=text/plain: No such file or directory" error
>
> curl
> http://localhost:8983/solr/update/csv?stream.file=/opt/apache-1.2.0/example/exampledocs/test.csv&stream.contentType=text/plain;charset=utf-8
>
> -bash: stream.contentType=text/plain: No such file or directory
>  undefined field cat
>
> What did I do wrong?
>
> Francis
>
> -Original Message-
> From: Norberto Meijome [mailto:numard...@gmail.com]
> Sent: Monday, July 06, 2009 11:01 AM
> To: Francis Yakin
> Cc: solr-user@lucene.apache.org
> Subject: Re: Is there any other way to load the index beside using "http"
> connection?
>
> On Mon, 6 Jul 2009 09:56:03 -0700
> Francis Yakin  wrote:
>
>>  Norberto,
>>
>> Thanks, I think my question is:
>>
>> >>why not generate your SQL output directly into your oracle server as a
>> file
>>
>> What type of file is this?
>>
>>
>
> a file in a format that you can then import into SOLR.
>
> _
> {Beto|Norberto|Numard} Meijome
>
> "Gravity cannot be blamed for people falling in love."
>   Albert Einstein
>
> I speak for myself, not my employer. Contents may be hot. Slippery when
> wet. Reading disclaimers makes you go blind. Writing them is worse. You
> have been Warned.
>
>

--
View this message in context: 
http://www.nabble.com/Is-there-any-other-way-to-load-the-index-beside-using-%22http%22-connection--tp24297934p24360603.html
Sent from the Solr - User mailing list archive at Nabble.com.



RE: Is there any other way to load the index beside using "http" connection?

2009-07-06 Thread NitinMalik

Hi Francis,

I have experienced that the update stream handler (for an xml file in my case)
worked only for Solr running on the same machine. I also got the same error when
I tried to update the documents on a remote Solr instance.

Regards
Nitin
   

Francis Yakin wrote:
> 
> 
> Ok, I have a CSV file (call it test.csv) from the database.
> 
> When I tried to upload this file to solr using this cmd, I got
> "stream.contentType=text/plain: No such file or directory" error
> 
> curl
> http://localhost:8983/solr/update/csv?stream.file=/opt/apache-1.2.0/example/exampledocs/test.csv&stream.contentType=text/plain;charset=utf-8
> 
> -bash: stream.contentType=text/plain: No such file or directory
>  undefined field cat
> 
> What did I do wrong?
> 
> Francis
> 
> -Original Message-
> From: Norberto Meijome [mailto:numard...@gmail.com]
> Sent: Monday, July 06, 2009 11:01 AM
> To: Francis Yakin
> Cc: solr-user@lucene.apache.org
> Subject: Re: Is there any other way to load the index beside using "http"
> connection?
> 
> On Mon, 6 Jul 2009 09:56:03 -0700
> Francis Yakin  wrote:
> 
>>  Norberto,
>>
>> Thanks, I think my question is:
>>
>> >>why not generate your SQL output directly into your oracle server as a
>> file
>>
>> What type of file is this?
>>
>>
> 
> a file in a format that you can then import into SOLR.
> 
> _
> {Beto|Norberto|Numard} Meijome
> 
> "Gravity cannot be blamed for people falling in love."
>   Albert Einstein
> 
> I speak for myself, not my employer. Contents may be hot. Slippery when
> wet. Reading disclaimers makes you go blind. Writing them is worse. You
> have been Warned.
> 
> 

-- 
View this message in context: 
http://www.nabble.com/Is-there-any-other-way-to-load-the-index-beside-using-%22http%22-connection--tp24297934p24360603.html
Sent from the Solr - User mailing list archive at Nabble.com.



Filtering MoreLikeThis results

2009-07-06 Thread Yao Ge

I could not find any support from http://wiki.apache.org/solr/MoreLikeThis on
how to restrict MLT results to certain subsets. I passed along a fq
parameter and it is ignored. Since we can not incorporate the filters in the
query itself which is used to retrieve the target for similarity comparison,
it appears there is no way to filter MLT results. BTW. I am using Solr 1.3. 
Please let me know if there is way (other than hacking the source code) to
do this. Thanks!
-- 
View this message in context: 
http://www.nabble.com/Filtering-MoreLikeThis-results-tp24360355p24360355.html
Sent from the Solr - User mailing list archive at Nabble.com.



RE: Is there any other way to load the index beside using "http" connection?

2009-07-06 Thread Francis Yakin

Ok, I have a CSV file (call it test.csv) from the database.

When I tried to upload this file to solr using this cmd, I got 
"stream.contentType=text/plain: No such file or directory" error

curl 
http://localhost:8983/solr/update/csv?stream.file=/opt/apache-1.2.0/example/exampledocs/test.csv&stream.contentType=text/plain;charset=utf-8

-bash: stream.contentType=text/plain: No such file or directory
 undefined field cat

What did I do wrong?

Francis
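
A note on the error itself: "-bash: stream.contentType=text/plain: No such file or directory" comes from the shell, not Solr; the unquoted & splits the command line, so everything after it is interpreted as a separate command. Quoting the URL avoids that:

curl "http://localhost:8983/solr/update/csv?stream.file=/opt/apache-1.2.0/example/exampledocs/test.csv&stream.contentType=text/plain;charset=utf-8"

The remaining "undefined field cat" message then comes from Solr and means the CSV contains a cat column that is not defined in schema.xml.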

-Original Message-
From: Norberto Meijome [mailto:numard...@gmail.com]
Sent: Monday, July 06, 2009 11:01 AM
To: Francis Yakin
Cc: solr-user@lucene.apache.org
Subject: Re: Is there any other way to load the index beside using "http" 
connection?

On Mon, 6 Jul 2009 09:56:03 -0700
Francis Yakin  wrote:

>  Norberto,
>
> Thanks, I think my question is:
>
> >>why not generate your SQL output directly into your oracle server as a file
>
> What type of file is this?
>
>

a file in a format that you can then import into SOLR.

_
{Beto|Norberto|Numard} Meijome

"Gravity cannot be blamed for people falling in love."
  Albert Einstein

I speak for myself, not my employer. Contents may be hot. Slippery when wet. 
Reading disclaimers makes you go blind. Writing them is worse. You have been 
Warned.


Re: Is there any other way to load the index beside using "http" connection?

2009-07-06 Thread Norberto Meijome
On Mon, 6 Jul 2009 09:56:03 -0700
Francis Yakin  wrote:

>  Norberto,
> 
> Thanks, I think my question is:
> 
> >>why not generate your SQL output directly into your oracle server as a file
> 
> What type of file is this?
> 
> 

a file in a format that you can then import into SOLR. 

_
{Beto|Norberto|Numard} Meijome

"Gravity cannot be blamed for people falling in love."
  Albert Einstein

I speak for myself, not my employer. Contents may be hot. Slippery when wet. 
Reading disclaimers makes you go blind. Writing them is worse. You have been 
Warned.


RE: Is there any other way to load the index beside using "http" connection?

2009-07-06 Thread Francis Yakin
 Norberto,

Thanks, I think my question is:

>>why not generate your SQL output directly into your oracle server as a file

What type of file is this?


Thanks again for your help.


Francis

-Original Message-
From: Norberto Meijome [mailto:numard...@gmail.com]
Sent: Monday, July 06, 2009 4:33 AM
To: Francis Yakin
Cc: solr-user@lucene.apache.org
Subject: Re: Is there any other way to load the index beside using "http" 
connection?

On Sun, 5 Jul 2009 10:28:16 -0700
Francis Yakin  wrote:

[...]>
> >upload the file to your SOLR server? Then the data file is local to your SOLR
> >server , you will bypass any WAN and firewall you may be having. (or some
> >variation of it, sql -> SOLR server as file, etc..)
>
> How do we upload the file? Do we need to convert the data file to a Lucene index
> first? And is there documentation on how we do this?

pick your poison... rsync? ftp? scp ?

B
_
{Beto|Norberto|Numard} Meijome

"The freethinking of one age is the common sense of the next."
   Matthew Arnold

I speak for myself, not my employer. Contents may be hot. Slippery when wet. 
Reading disclaimers makes you go blind. Writing them is worse. You have been 
Warned.


Re: Replication In 1.4

2009-07-06 Thread Otis Gospodnetic

And if you don't mind using the nightly Solr build, the admin page caching has 
been fixed in the trunk, so this won't bite you again.

 Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



- Original Message 
> From: Lee Theobald 
> To: solr-user@lucene.apache.org
> Sent: Monday, July 6, 2009 11:41:16 AM
> Subject: Re: Replication In 1.4
> 
> 
> The problem turned out to be twofold.  Firstly, Tomcat was caching an old
> version of the admin page (hiding the replication link and some other info)
> and secondly I had a mistake in some configuration meaning the indexes
> weren't building in the correct places.  But it's sorted now and I have some
> lovely replication working with my two cores.
> 
> Thanks for your help Mark.
> -- 
> View this message in context: 
> http://www.nabble.com/Replication-In-1.4-tp24356158p24357821.html
> Sent from the Solr - User mailing list archive at Nabble.com.



Re: Using MMapDirectory

2009-07-06 Thread Otis Gospodnetic

Is there a benefit to using MMapDirectory instead of, say, tmpfs (RAM disk) 
under Linux?

 Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



- Original Message 
> From: Mark Miller 
> To: solr-user@lucene.apache.org
> Sent: Monday, July 6, 2009 9:28:43 AM
> Subject: Re: Using MMapDirectory
> 
> Marc Sturlese wrote:
> > Hey there,
> > 
> > For testing purpose I am trying to use lucene's MMapDirectory in a recent
> > Solr nightly instance. I have read in Lucene's documentation:
> > "To use MMapDirectory, invoke Java with the System property
> > org.apache.lucene.FSDirectory.class set to
> > org.apache.lucene.store.MMapDirectory. This will cause
> > FSDirectory.getDirectory(File,boolean) to return instances of this class. "
> > 
> > Do I have to change something in solrconfig.xml or modifying system property
> > is just enough?
> > Thanks in advance.
> >  
> The system property won't do it. You will have to try the custom 
> DirectoryFactory in solrconfig.xml.
> 
> Report back how it goes - haven't tried Solr with MMapDirectory before myself.
> 
> -- - Mark
> 
> http://www.lucidimagination.com



Re: Multiple values for custom fields provided in SOLR query

2009-07-06 Thread Otis Gospodnetic

I actually don't fully understand your question.
q=+fileID:111+fileID:222+fileID:333+apple looks like a valid query to me.
(not sure what that space encoded as + is, though)

Also not sure what you mean by:
> Basically the requirement is , if fileIDs are provided as search parameter
> then search should happen on the basis of fileID.


Do you mean "apple" should be ignored if a term (field name:field value) is 
provided?

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



- Original Message 
> From: Suryasnat Das 
> To: solr-user@lucene.apache.org
> Sent: Monday, July 6, 2009 11:31:10 AM
> Subject: Multiple values for custom fields provided in SOLR query
> 
> Hi,
> I have a requirement in which i need to have multiple values in my custom
> fields while forming the search query to SOLR. For example,
> fileID is my custom field. I have defined the fileID in schema.xml as <field
> name="fileID" type="string" indexed="true" stored="true" required="true"
> multiValued="true"/>.
> Now fileID can have multiple values like 111,222,333 etc. So will my query
> be of the form,
> 
> q=+fileID:111+fileID:222+fileID:333+apple
> 
> where apple is my search query string. I tried with the above query but it
> did not work. SOLR gave invalid query error.
> Basically the requirement is , if fileIDs are provided as search parameter
> then search should happen on the basis of fileID.
> 
> Is my approach correct or i need to do something else? Please, if immediate
> help is provided then that would be great.
> 
> Regards
> Suryasnat Das
> Infosys.



Re: Retrieve docs with > 1 multivalue field hits

2009-07-06 Thread Otis Gospodnetic

I don't recall seeing that mentioned at all... but my memory fails me all the 
time. Who's Solr?

 Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



- Original Message 
> From: A. Steven Anderson 
> To: solr-user@lucene.apache.org
> Sent: Monday, July 6, 2009 12:25:48 PM
> Subject: Re: Retrieve docs with > 1 multivalue field hits
> 
> I thought this would be a quick yes or no answer and/or reference to another
> thread, but alas, I got no replies.
> 
> Is it safe to assume the answer is 'no' for both Solr 1.3 and 1.4?
> 
> 
> On Thu, Jul 2, 2009 at 3:48 PM, A. Steven Anderson wrote:
> 
> > Greetings!
> >
> > I thought I remembered seeing a thread related to retrieving only documents
> > that had more than one hit in a particular multivalue field, but I cannot
> > find it now.
> >
> > Regardless, is this possible in Solr 1.3? Solr 1.4?
> >
> 
> -- 
> A. Steven Anderson
> Independent Consultant



Re: Retrieve docs with > 1 multivalue field hits

2009-07-06 Thread A. Steven Anderson
I thought this would be a quick yes or no answer and/or reference to another
thread, but alas, I got no replies.

Is it safe to assume the answer is 'no' for both Solr 1.3 and 1.4?


On Thu, Jul 2, 2009 at 3:48 PM, A. Steven Anderson wrote:

> Greetings!
>
> I thought I remembered seeing a thread related to retrieving only documents
> that had more than one hit in a particular multivalue field, but I cannot
> find it now.
>
> Regardless, is this possible in Solr 1.3? Solr 1.4?
>

-- 
A. Steven Anderson
Independent Consultant


Re: Popular keywords statistics .

2009-07-06 Thread Alexander Wallace

Indeed that was one of the first  approaches...

Thanks a lot!

Michael Ludwig wrote:

Wallace wrote:

I'd like to hear what approaches are being used by users to know what
people are searching for in their apps.


You could process the access log.

You could write a filter servlet logging the relevant part of the query
string to a dedicated location.

Michael Ludwig




grouping and sorting by facet?

2009-07-06 Thread Peter Keane
Sorry if I am missing something obvious here

Is there a way to group and sort by facet count?  I have a large set of
images, each of which is part of a different "collection."  I am performing
a faceted search:

/solr/select/?q=my+term&max=30&version=2.2&rows=30&start=0&facet=true&facet.field=collection&facet.sort=true

I would like to group the results by collection count.

So all of the images in the collection with the most image "hits" comes
first.

Not sure how to do that

--Peter Keane
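
Not answered in the thread, but one common two-pass workaround: fetch only the facet counts first (facet.sort=true orders facets by count), then issue one filtered query per collection in that order:

/solr/select/?q=my+term&rows=0&facet=true&facet.field=collection&facet.sort=true

/solr/select/?q=my+term&rows=30&start=0&fq=collection:some_collection

Here some_collection stands for each facet value returned by the first request; whether this is workable depends on how many collections match.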


Re: Replication In 1.4

2009-07-06 Thread Lee Theobald

The problem turned out to be twofold.  Firstly, Tomcat was caching an old
version of the admin page (hiding the replication link and some other info)
and secondly I had a mistake in some configuration meaning the indexes
weren't building in the correct places.  But it's sorted now and I have some
lovely replication working with my two cores.

Thanks for your help Mark.
-- 
View this message in context: 
http://www.nabble.com/Replication-In-1.4-tp24356158p24357821.html
Sent from the Solr - User mailing list archive at Nabble.com.



Multiple values for custom fields provided in SOLR query

2009-07-06 Thread Suryasnat Das
Hi,
I have a requirement in which i need to have multiple values in my custom
fields while forming the search query to SOLR. For example,
fileID is my custom field. I have defined the fileID in schema.xml as <field
name="fileID" type="string" indexed="true" stored="true" required="true"
multiValued="true"/>.
Now fileID can have multiple values like 111,222,333 etc. So will my query
be of the form,

q=+fileID:111+fileID:222+fileID:333+apple

where apple is my search query string. I tried with the above query but it
did not work. SOLR gave invalid query error.
Basically the requirement is , if fileIDs are provided as search parameter
then search should happen on the basis of fileID.

Is my approach correct or i need to do something else? Please, if immediate
help is provided then that would be great.

Regards
Suryasnat Das
Infosys.
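
A note on the query syntax (not from the thread): a literal + in a URL decodes to a space, so the + prefix operator has to be sent as %2B; alternatively, a filter query avoids the operator entirely. Two sketches of the intended search:

q=%2Bapple+%2BfileID:(111+222+333)

q=apple&fq=fileID:(111+OR+222+OR+333)

The fq form restricts results to the listed fileIDs without affecting relevance scoring.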


Re: Replication In 1.4

2009-07-06 Thread Mark Miller
On Mon, Jul 6, 2009 at 10:43 AM, Lee Theobald  wrote:

>
> I think it's best if I just clear down and start again.  I've got something
> wrong somewhere.  I've noticed that what I consider to be the slave does
> have a "Replication" link in the admin area.  My master doesn't.  They both
> seem to be reporting the same version from the info page
> (1.3.0.2009.06.30.08.05.44, nightly exported - yonik - 2009-06-30 08:05:44)
> but I'm guessing I may have something pointing to an old 1.3 solr war/jar.
> --
> View this message in context:
> http://www.nabble.com/Replication-In-1.4-tp24356158p24356750.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>
>
You should see the link based on the replication RequestHandler being
detected. Is it still commented out on the Master solrconfig?

-- 
-- 
- Mark

http://www.lucidimagination.com


Re: Replication In 1.4

2009-07-06 Thread Lee Theobald

I think it's best if I just clear down and start again.  I've got something
wrong somewhere.  I've noticed that what I consider to be the slave does
have a "Replication" link in the admin area.  My master doesn't.  They both
seem to be reporting the same version from the info page
(1.3.0.2009.06.30.08.05.44, nightly exported - yonik - 2009-06-30 08:05:44)
but I'm guessing I may have something pointing to an old 1.3 solr war/jar.
-- 
View this message in context: 
http://www.nabble.com/Replication-In-1.4-tp24356158p24356750.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Replication In 1.4

2009-07-06 Thread Lee Theobald

Cheers for the info Mark.  That looks pretty similar to what I have.  Slave
is almost the same, master was slightly different but I don't think
incorrect:


<requestHandler name="/replication" class="solr.ReplicationHandler" >
  <lst name="master">
    <str name="replicateAfter">optimize</str>
    <str name="backupAfter">optimize</str>
    <str name="confFiles">solrconfig_slave.xml:solrconfig.xml,schema.xml,stopwords.txt,elevate.xml</str>
  </lst>
</requestHandler>


I'll keep looking as I'm bound to have missed something but I can't quite
see what it is yet.

Lee,


markrmiller wrote:
> 
> Have you uncommented the proper RequestHandlers in solrconfig.xml?
> 
> 
> 
> 
> 
> 
> 
> -- 
> - Mark
> 
> http://www.lucidimagination.com
> 
> 
> 
> 
> 

-- 
View this message in context: 
http://www.nabble.com/Replication-In-1.4-tp24356158p24356526.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Replication In 1.4

2009-07-06 Thread Mark Miller

Lee Theobald wrote:

Hi all,

I've been trying to get the replication working in my test version of Solr
(1.4) but I don't think I've got it right.  There doesn't seem to be any
errors but it doesn't seem to be working either.  I'm going to the
distribution admin page [1] and just getting a bit of text telling me "No
distribution info present".  I've tried with multiple cores (what I want but
perhaps isn't possible looking at the raised bugs) and a single core, both
give the same results.  Going to the replication status page [2] gives me an
OK.

I'm thinking I've missed a vital bit of configuration.  Is there anything on
this page [3] that is missing but I need?  For example, listeners in the
solrconfig.xml?  If so, could someone please give me an example.

Cheers for any input,
Lee

[1] http://localhost:8080/solr/admin/distributiondump.jsp
[2] http://localhost:8080/solr/replication
[3] http://wiki.apache.org/solr/SolrReplication
  

Have you uncommented the proper RequestHandlers in solrconfig.xml?








--
- Mark

http://www.lucidimagination.com
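
For reference, the request handler definitions from the SolrReplication wiki page that this thread revolves around look like the following; the masterUrl and pollInterval values are illustrative. On the master:

<requestHandler name="/replication" class="solr.ReplicationHandler" >
    <lst name="master">
        <str name="replicateAfter">commit</str>
        <str name="confFiles">schema.xml,stopwords.txt</str>
    </lst>
</requestHandler>

and on the slave:

<requestHandler name="/replication" class="solr.ReplicationHandler" >
    <lst name="slave">
        <str name="masterUrl">http://localhost:8983/solr/replication</str>
        <str name="pollInterval">00:00:60</str>
    </lst>
</requestHandler>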





Faceting with MoreLikeThis

2009-07-06 Thread Yao Ge

Does Solr support faceting on MoreLikeThis search results?
-- 
View this message in context: 
http://www.nabble.com/Faceting-with-MoreLikeThis-tp24356166p24356166.html
Sent from the Solr - User mailing list archive at Nabble.com.



Replication In 1.4

2009-07-06 Thread Lee Theobald

Hi all,

I've been trying to get the replication working in my test version of Solr
(1.4) but I don't think I've got it right.  There doesn't seem to be any
errors but it doesn't seem to be working either.  I'm going to the
distribution admin page [1] and just getting a bit of text telling me "No
distribution info present".  I've tried with multiple cores (what I want but
perhaps isn't possible looking at the raised bugs) and a single core, both
give the same results.  Going to the replication status page [2] gives me an
OK.

I'm thinking I've missed a vital bit of configuration.  Is there anything on
this page [3] that is missing but I need?  For example, listeners in the
solrconfig.xml?  If so, could someone please give me an example.

Cheers for any input,
Lee

[1] http://localhost:8080/solr/admin/distributiondump.jsp
[2] http://localhost:8080/solr/replication
[3] http://wiki.apache.org/solr/SolrReplication
-- 
View this message in context: 
http://www.nabble.com/Replication-In-1.4-tp24356158p24356158.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Using MMapDirectory

2009-07-06 Thread Mark Miller

Marc Sturlese wrote:

Hey there,

For testing purpose I am trying to use lucene's MMapDirectory in a recent
Solr nightly instance. I have read in Lucene's documentation:
"To use MMapDirectory, invoke Java with the System property
org.apache.lucene.FSDirectory.class set to
org.apache.lucene.store.MMapDirectory. This will cause
FSDirectory.getDirectory(File,boolean) to return instances of this class. "

Do I have to change something in solrconfig.xml or modifying system property
is just enough?
Thanks in advance.
  
The system property won't do it. You will have to try the custom 
DirectoryFactory in solrconfig.xml.


Report back how it goes - haven't tried Solr with MMapDirectory before 
myself.


--
- Mark

http://www.lucidimagination.com
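
A sketch of the custom-factory route Mark suggests, assuming the Solr 1.4 DirectoryFactory API and the Lucene 2.9 MMapDirectory constructor; the class and package names are hypothetical:

import java.io.File;
import java.io.IOException;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.MMapDirectory;
import org.apache.solr.core.DirectoryFactory;

// Returns a memory-mapped directory for every index path Solr opens.
public class MMapDirectoryFactory extends DirectoryFactory {
  public Directory open(String path) throws IOException {
    return new MMapDirectory(new File(path));
  }
}

wired into solrconfig.xml with something like:

<directoryFactory name="DirectoryFactory" class="com.example.MMapDirectoryFactory"/>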





Re: Is there any other way to load the index beside using "http" connection?

2009-07-06 Thread Marcus Herou
Yes exactly just being friendly sharing a working routine. Took me some
hours to figure out DIH myself at the time.

//Marcus

On Mon, Jul 6, 2009 at 1:32 PM, Norberto Meijome wrote:

> On Sun, 5 Jul 2009 21:36:35 +0200
> Marcus Herou  wrote:
>
> > Sharing some of our exports from DB to solr. Note: many of the statements
> > below might not work due to clip-clip.
>
> thx Marcus - but that's a DIH config right? :)
> b
> _
> {Beto|Norberto|Numard} Meijome
>
> "I respect faith, but doubt is what gives you an education."
>   Wilson Mizner
>
> I speak for myself, not my employer. Contents may be hot. Slippery when
> wet. Reading disclaimers makes you go blind. Writing them is worse. You have
> been Warned.
>



-- 
Marcus Herou CTO and co-founder Tailsweep AB
+46702561312
marcus.he...@tailsweep.com
http://www.tailsweep.com/


Re: Is there any other way to load the index beside using "http" connection?

2009-07-06 Thread Norberto Meijome
On Sun, 5 Jul 2009 10:28:16 -0700
Francis Yakin  wrote:

[...]> 
> >upload the file to your SOLR server? Then the data file is local to your SOLR
> >server , you will bypass any WAN and firewall you may be having. (or some
> >variation of it, sql -> SOLR server as file, etc..)
> 
> How do we upload the file? Do we need to convert the data file to a Lucene index
> first? And is there documentation on how we do this?

pick your poison... rsync? ftp? scp ? 

B
_
{Beto|Norberto|Numard} Meijome

"The freethinking of one age is the common sense of the next."
   Matthew Arnold

I speak for myself, not my employer. Contents may be hot. Slippery when wet. 
Reading disclaimers makes you go blind. Writing them is worse. You have been 
Warned.


Re: Is there any other way to load the index beside using "http" connection?

2009-07-06 Thread Norberto Meijome
On Sun, 5 Jul 2009 21:36:35 +0200
Marcus Herou  wrote:

> Sharing some of our exports from DB to solr. Note: many of the statements
> below might not work due to clip-clip.

thx Marcus - but that's a DIH config right? :)
b
_
{Beto|Norberto|Numard} Meijome

"I respect faith, but doubt is what gives you an education."
   Wilson Mizner

I speak for myself, not my employer. Contents may be hot. Slippery when wet. 
Reading disclaimers makes you go blind. Writing them is worse. You have been 
Warned.


Using MMapDirectory

2009-07-06 Thread Marc Sturlese

Hey there,

For testing purpose I am trying to use lucene's MMapDirectory in a recent
Solr nightly instance. I have read in Lucene's documentation:
"To use MMapDirectory, invoke Java with the System property
org.apache.lucene.FSDirectory.class set to
org.apache.lucene.store.MMapDirectory. This will cause
FSDirectory.getDirectory(File,boolean) to return instances of this class. "

Do I have to change something in solrconfig.xml or modifying system property
is just enough?
Thanks in advance.
-- 
View this message in context: 
http://www.nabble.com/Using-MMapDirectory-tp24353063p24353063.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Index partitioning with solr multiple core feature

2009-07-06 Thread Sumit Aggarwal
Shalin,
First of all, each entity's data is unrelated, so it makes sense to use the solr
core concept as per your suggestion.

But since you are suggesting that putting each entity's index on the same box will
consume CPU, does it make sense to add boxes based on the number of
entities, considering I will have to add replication boxes as well, amounting to a
huge cost?

This is what I am thinking after your suggestion: have separate boxes for
each entity, and then inside each entity do some partitioning based on round
robin or some other strategy. With this, if I am searching on any entity's data,
I will just need to reach a box for that entity. Now, since I am also doing
partitioning inside an entity, how will I search the data so that I get
merged results from each partition within a single entity's box? If I am doing
this type of partitioning, then which functionality of solr will I use... is
it http://wiki.apache.org/solr/IndexPartitioning ?

My actual concern is performance, irrespective of the implementation
design, along with good scaling logic for the future.


On Mon, Jul 6, 2009 at 3:16 PM, Shalin Shekhar Mangar <
shalinman...@gmail.com> wrote:

> On Mon, Jul 6, 2009 at 3:05 PM, Sumit Aggarwal  >wrote:
>
> > Hi Shalin,
> > Yes i want to achieve a logical separation of indexes for performance
> > reason
> > also else index size will keep on growing as i have 8 different entities.
> I
> > am already partitioning all these entities to different servers also on
> > which i will be doing search based on distributed search by solr using
> > shards and collecting merged results from 3 different servers. You
> > mentioned
> > i wont achieve putting all partitions on the same box , why is that so?
>
>
> This is because each shard will compete for CPU and disk if you put them on
> the same box. Logical separation and partitioning for performance are two
> different things. You should partition if one Solr instance is not able to
> hold the complete index or if it is not giving you the desired performance.
> You can use multiple cores if the data is unrelated and you wouldn't need
> to
> search on all of them.
>
> In your case, the primary reason is performance, so it makes sense to put
> each shard on a separate box.
>
>
> > While reading solr core it says solr core is used for different
> > applications
> > only My search on different entities is also a type of different
> > applications theoretically 
> >
> > Does solr provide any good support for index partitioning?
>
>
> No. Partitioning is not done by Solr. So you should decide your
> partitioning
> scheme: round robin, fixed hashing, random etc. Once you have partitioned
> your data, a distributed search helps you search over all the shards in one
> go.
>
> --
> Regards,
> Shalin Shekhar Mangar.
>



-- 
Cheers
Sumit


Re: Popular keywords statistics .

2009-07-06 Thread Michael Ludwig

Wallace wrote:

I'd like to hear what approaches are being used by users to know what
people are searching for in their apps.


You could process the access log.

You could write a filter servlet logging the relevant part of the query
string to a dedicated location.

Michael Ludwig


Re: Index partitioning with solr multiple core feature

2009-07-06 Thread Shalin Shekhar Mangar
On Mon, Jul 6, 2009 at 3:05 PM, Sumit Aggarwal wrote:

> Hi Shalin,
> Yes i want to achieve a logical separation of indexes for performance
> reason
> also else index size will keep on growing as i have 8 different entities. I
> am already partitioning all these entities to different servers also on
> which i will be doing search based on distributed search by solr using
> shards and collecting merged results from 3 different servers. You
> mentioned
> i wont achieve putting all partitions on the same box , why is that so?


This is because each shard will compete for CPU and disk if you put them on
the same box. Logical separation and partitioning for performance are two
different things. You should partition if one Solr instance is not able to
hold the complete index or if it is not giving you the desired performance.
You can use multiple cores if the data is unrelated and you wouldn't need to
search on all of them.

In your case, the primary reason is performance, so it makes sense to put
each shard on a separate box.


> While reading solr core it says solr core is used for different
> applications
> only My search on different entities is also a type of different
> applications theoretically 
>
> Does solr provide any good support for index partitioning?


No. Partitioning is not done by Solr. So you should decide your partitioning
scheme: round robin, fixed hashing, random etc. Once you have partitioned
your data, a distributed search helps you search over all the shards in one
go.

-- 
Regards,
Shalin Shekhar Mangar.
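
The distributed search Shalin refers to is driven by the shards request parameter; a sketch with illustrative hosts:

http://box1:8983/solr/select?q=apple&shards=box1:8983/solr,box2:8983/solr,box3:8983/solr

The node receiving the request queries every listed shard and merges the results; documents must have unique keys across all shards.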


Re: Index partitioning with solr multiple core feature

2009-07-06 Thread Sumit Aggarwal
Shalin,
At a time I will be doing search on only one entity... Also, data will be
indexed only to the corresponding entity.

Thanks,
Sumit

On Mon, Jul 6, 2009 at 3:05 PM, Sumit Aggarwal wrote:

> Hi Shalin,
> Yes i want to achieve a logical separation of indexes for performance
> reason also else index size will keep on growing as i have 8 different
> entities. I am already partitioning all these entities to different servers
> also on which i will be doing search based on distributed search by solr
> using shards and collecting merged results from 3 different servers. You
> mentioned i wont achieve putting all partitions on the same box , why is
> that so?
>
> While reading solr core it says solr core is used for different
> applications only My search on different entities is also a type of
> different applications theoretically 
>
> Does solr provide any good support for index partitioning?
> Thanks,
> Sumit
>
> On Mon, Jul 6, 2009 at 2:43 PM, Shalin Shekhar Mangar <
> shalinman...@gmail.com> wrote:
>
>> On Mon, Jul 6, 2009 at 1:40 PM, Sumit Aggarwal > >wrote:
>>
>> > I was trying to implement entity based partitioning using multiple core
>> > feature.
>> > So my solr.xml is like :
>> > 
>> > 
>> > 
>> > 
>> > 
>> > 
>> > 
>> > 
>> > 
>> > 
>> >
>> > Now using http://localhost:8983/solr/User/ or
> > http://localhost:8983/solr/Group/ i am able to reach a separate partition
>> > for
>> > entity based search. Now question arises for entity based indexing. I
>> was
>> > reading http://wiki.apache.org/solr/IndexPartitioning document but it
>> does
>> > not help much How can i do entity based indexing of document..
>> > I don't want to make http url based on entity for indexing purpose.
>>
>>
>> Why not? You know which document belongs to which "entity" so you can
>> select
>> which core to post that document to.
>>
>>
>>
>> > Another requirement: Since i have entity based partitioning and each
>> entity
>> > can have total index size more than 10GB so i need another partitioning
>> > inside entity like based on no of document in an index inside entity.
>> How
>> > can i do this? Unfortunately solr wiki does not say much on
>> partitioning..
>> >
>>
>> What are you trying to achieve by partitioning your data? Is it just for
>> logical separation? If it is for performance reasons, I don't think you'll
>> gain much by putting all partitions on the same box.
>>
>> --
>> Regards,
>> Shalin Shekhar Mangar.
>>
>
>
>
> --
> Cheers
> Sumit
> 9818621804
>



-- 
Cheers
Sumit
9818621804


Re: Index partitioning with solr multiple core feature

2009-07-06 Thread Sumit Aggarwal
Hi Shalin,
Yes, I want to achieve a logical separation of indexes, for performance reasons
but also because the index size will keep on growing, as I have 8 different entities.
I am already partitioning all these entities across different servers, over
which I will be searching via solr's distributed search using
shards and collecting merged results from 3 different servers. You mentioned
I won't gain by putting all partitions on the same box; why is that so?

While reading about solr cores, it says a solr core is used for different applications
only. My search on different entities is also, theoretically, a type of different
application...

Does solr provide any good support for index partitioning?
Thanks,
Sumit

On Mon, Jul 6, 2009 at 2:43 PM, Shalin Shekhar Mangar <
shalinman...@gmail.com> wrote:

> On Mon, Jul 6, 2009 at 1:40 PM, Sumit Aggarwal  >wrote:
>
> > I was trying to implement entity based partitioning using multiple core
> > feature.
> > So my solr.xml is like :
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> >
> > Now using http://localhost:8983/solr/User/ or
> > http://localhost:8983/solr/Group/ i am able to reach a separate partition
> > for
> > entity based search. Now question arises for entity based indexing. I was
> > reading http://wiki.apache.org/solr/IndexPartitioning document but it
> does
> > not help much How can i do entity based indexing of document..
> > I don't want to make http url based on entity for indexing purpose.
>
>
> Why not? You know which document belongs to which "entity" so you can
> select
> which core to post that document to.
>
>
>
> > Another requirement: Since i have entity based partitioning and each
> entity
> > can have total index size more than 10GB so i need another partitioning
> > inside entity like based on no of document in an index inside entity. How
> > can i do this? Unfortunately solr wiki does not say much on
> partitioning..
> >
>
> What are you trying to achieve by partitioning your data? Is it just for
> logical separation? If it is for performance reasons, I don't think you'll
> gain much by putting all partitions on the same box.
>
> --
> Regards,
> Shalin Shekhar Mangar.
>



-- 
Cheers
Sumit
9818621804


Re: Index partitioning with solr multiple core feature

2009-07-06 Thread Shalin Shekhar Mangar
On Mon, Jul 6, 2009 at 1:40 PM, Sumit Aggarwal wrote:

> I was trying to implement entity based partitioning using multiple core
> feature.
> So my solr.xml is like :
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
>
> Now using http://localhost:8983/solr/User/ or
> http://localhost:8983/solr/Group/ i am able to reach a separate partition
> for
> entity based search. Now question arises for entity based indexing. I was
> reading http://wiki.apache.org/solr/IndexPartitioning document but it does
> not help much How can i do entity based indexing of document..
> I don't want to make http url based on entity for indexing purpose.


Why not? You know which document belongs to which "entity" so you can select
which core to post that document to.



> Another requirement: Since i have entity based partitioning and each entity
> can have total index size more than 10GB so i need another partitioning
> inside entity like based on no of document in an index inside entity. How
> can i do this? Unfortunately solr wiki does not say much on partitioning..
>

What are you trying to achieve by partitioning your data? Is it just for
logical separation? If it is for performance reasons, I don't think you'll
gain much by putting all partitions on the same box.

-- 
Regards,
Shalin Shekhar Mangar.


Re: Index partitioning with solr multiple core feature

2009-07-06 Thread Sumit Aggarwal
I forgot to mention i already have a partitioning to 3 different servers for
each entity based on some unique int value.

On Mon, Jul 6, 2009 at 1:40 PM, Sumit Aggarwal wrote:

> I was trying to implement entity based partitioning using multiple core
> feature.
> So my solr.xml is like :
> 
> 
>  
> 
>  
> 
>  
> 
>  
> 
>
> Now using http://localhost:8983/solr/User/ or
> http://localhost:8983/solr/Group/ i am able to reach a separate partition
> for entity based search. Now question arises for entity based indexing. I
> was reading http://wiki.apache.org/solr/IndexPartitioning document but it
> does not help much How can i do entity based indexing of document..
> I don't want to make http url based on entity for indexing purpose. Kindly
> help me in this?
>
> Another requirement: Since i have entity based partitioning and each entity
> can have total index size more than 10GB so i need another partitioning
> inside entity like based on no of document in an index inside entity. How
> can i do this? Unfortunately solr wiki does not say much on partitioning..
> --
> Cheers
> Sumit
>



-- 
Cheers
Sumit
9818621804


Index partitioning with solr multiple core feature

2009-07-06 Thread Sumit Aggarwal
I was trying to implement entity based partitioning using multiple core
feature.
So my solr.xml is like :











Now using http://localhost:8983/solr/User/ or
http://localhost:8983/solr/Group/ I am able to reach a separate partition for
entity based search. Now the question arises for entity based indexing. I was
reading the http://wiki.apache.org/solr/IndexPartitioning document but it does
not help much... How can I do entity based indexing of documents?
I don't want to make the http url depend on the entity for indexing purposes. Kindly
help me with this?

Another requirement: since I have entity based partitioning, and each entity
can have a total index size of more than 10GB, I need another level of partitioning
inside an entity, e.g. based on the number of documents in an index inside the
entity. How can I do this? Unfortunately the solr wiki does not say much about
partitioning...
-- 
Cheers
Sumit
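
The solr.xml listing referenced above did not survive the archive; a minimal multicore sketch consistent with the core names Sumit uses (instanceDir values assumed, one <core/> per entity, 8 in total):

<solr persistent="true">
  <cores adminPath="/admin/cores">
    <core name="User" instanceDir="User" />
    <core name="Group" instanceDir="Group" />
    <!-- ...remaining entity cores... -->
  </cores>
</solr>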