Re: Possible Containers

2009-06-15 Thread John Martyniak
I have been using Jetty and have been really happy with the ease of
use and performance.
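
A minimal way to try that, assuming the Jetty setup that ships in the Solr 1.3 example directory:

  $ cd apache-solr-1.3.0/example
  $ java -jar start.jar
  # the admin UI should then answer at http://localhost:8983/solr/admin/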


-John

On Jun 15, 2009, at 3:41 PM, Andrew Oliver wrote:


I've had it running in Jetty and Tomcat.

Tomcat 6 + JDK 6 have some nice performance characteristics, especially
with non-blocking IO, persistent connections, etc.
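
A sketch of switching Tomcat 6 to its NIO connector in conf/server.xml, assuming the default ports (this is the stock connector element with only the protocol attribute changed):

  <Connector port="8080"
             protocol="org.apache.coyote.http11.Http11NioProtocol"
             connectionTimeout="20000"
             redirectPort="8443" />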

It is likely that it will run in Resin, though I haven't tried it.

It will also likely run in any of the Tomcat-based stuff (e.g. tc
Server from SpringSource, JBoss AS from Red Hat).


-Andy

On Mon, Jun 15, 2009 at 2:25 PM, Mukerjee, Neiloy
(Neil) neil.muker...@alcatel-lucent.com wrote:
Having tried Tomcat without much success, upon realizing that I'm
using Tomcat 5.5 for other projects I'm working on and that I would
be best off using Tomcat 6 for Solr v1.3.0, I am in search of another
possible container. What have people used successfully that would be
a good starting point for me to try out?




John Martyniak
President/CEO
Before Dawn Solutions, Inc.
9457 S. University Blvd #266
Highlands Ranch, CO 80126
o: 877-499-1562
c: 303-522-1756
e: j...@beforedawnsoutions.com
w: http://www.beforedawnsolutions.com



Re: indexing multiple tables

2009-03-26 Thread John Martyniak
You could probably create a type field in the index to indicate the
task type, and then use the task type plus the primary key from the
DB to create the id within the index. That would save you a lot of
maintenance, and it has a bunch of other benefits.
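
A minimal sketch of that approach using the DataImportHandler, assuming hypothetical tables task_a/task_b and the TemplateTransformer to build the composite id (table, column, and field names are illustrative):

  <document>
    <entity name="taskA" transformer="TemplateTransformer"
            query="SELECT a_id, title FROM task_a">
      <!-- composite id: task type + primary key -->
      <field column="id"   template="taskA-${taskA.a_id}"/>
      <field column="type" template="taskA"/>
    </entity>
    <entity name="taskB" transformer="TemplateTransformer"
            query="SELECT b_id, title FROM task_b">
      <field column="id"   template="taskB-${taskB.b_id}"/>
      <field column="type" template="taskB"/>
    </entity>
  </document>

With schema.xml keeping a single <uniqueKey>id</uniqueKey>, both tables land in one index, and a filter on the type field narrows a search to one task.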


-John

On Mar 26, 2009, at 8:23 AM, Radha C. cra...@ceiindia.com wrote:


Giovanni,

Much Thanks for the reply.

We have a separate set of tables for each task, so we are going to
provide a different search based on the task. The tables of one task
are unrelated to the tables of another task.



From: Giovanni De Stefano [mailto:giovanni.destef...@gmail.com]
Sent: Thursday, March 26, 2009 5:51 PM
To: solr-user@lucene.apache.org; cra...@ceiindia.com
Subject: Re: indexing multiple tables


Hello,

that might be a solution although it is a maintenance nightmare...

Are all those tables completely unrelated? Meaning, does each table
produce a totally different document?

Either way, when you perform a search you must return a common document
(unless your client is able to distinguish between different documents
and create an ad hoc result).

Perhaps you should wait for an answer from one of those who really
know about this stuff... I am pretty new to Solr.

Cheers,
Giovanni


On 3/26/09, Radha C. cra...@ceiindia.com wrote:

Thanks for your reply.

If I want to search my data spread over many tables, say more than 50
tables, do I have to set up that many cores?


From: Giovanni De Stefano [mailto:giovanni.destef...@gmail.com]
Sent: Thursday, March 26, 2009 5:04 PM
To: solr-user@lucene.apache.org; cra...@ceiindia.com
Subject: Re: indexing multiple tables


Hello,

I believe you should use 2 different indexes, 2 different cores, and
write a custom request handler or some other client that forwards the
query to the cores and merges the results.
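
A sketch of the query side under those assumptions, using Solr's distributed search (shards) support with two cores named core0 and core1 on one host (names illustrative):

  http://localhost:8983/solr/core0/select?q=test&shards=localhost:8983/solr/core0,localhost:8983/solr/core1

Note that distributed search expects a compatible schema across the shards (in particular a shared uniqueKey), which may not hold for completely unrelated tables; hence the suggestion of a custom merging client.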

Cheers,
Giovanni


On 3/26/09, Radha C. cra...@ceiindia.com wrote:

Hi,

I am trying to index different tables with different primary keys and
different fields.

Table A - primary field is a_id
Table B - primary field is b_id

How to specify two different primary keys for two different tables in
schema.xml?

Is it possible to create a data-config with different root
entities/documents and index/search everything?

Thanks in advance.









Summing the results in a collapse

2009-01-12 Thread John Martyniak
I have been using the Collapse extension, and have it working pretty
well.


However, I would like to find out if there is a way to show the
collapsed results, and then sum up a field of the remaining results.
For example:

I display Result 1 (there are 20 results, totalling $50.00), where 20
is the number of items returned from the collapse and $50.00 is the
sum of the fee field across the 20 collapsed results.


Any help would be greatly appreciated.

Thank you,

-John




Multiple result fields in a collapse or subquery

2009-01-12 Thread John Martyniak
Is there any way to have multiple collapse.field directives in the
search string?


What I am trying to accomplish is the following:

Result 1 (20 results)
EU (5 results)
USD (15 results)

Result 2 (10 results)
EU (5 results)
USD (5 results)

I thought that this could be done with faceting, but with faceting you
get the sum total for each keyword. So for the above I get:

EU (10 results)
USD (20 results)

Which works well for guiding a search into deeper, more meaningful results.

However, I would like to have additional data that is tailored to each
result row.


Any help would be greatly appreciated.

Thank you,

-John



Field Collapse Install Issue.

2008-12-17 Thread John Martyniak

Hi everybody,

So I have applied Ivan's latest patch to a clean 1.3.

I built it using 'ant compile' and 'ant dist', and got the Solr build
WAR file.


Moved that into the Tomcat directory.

Modified my solrconfig.xml to include the following:
  <searchComponent name="collapse"
      class="org.apache.solr.handler.component.CollapseComponent" />

  <arr name="components">
    <str>query</str>
    <str>facet</str>
    <str>mlt</str>
    <str>highlight</str>
    <str>debug</str>
    <str>collapse</str>
  </arr>
  <arr name="first-components">
    <str>myFirstComponentName</str>
    <str>collapse</str>
  </arr>
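
For reference, a components list like the one above normally lives inside a request handler declaration. A sketch of that context (the handler name is illustrative, and myFirstComponentName is the placeholder from the wiki example):

  <requestHandler name="/collapse"
      class="org.apache.solr.handler.component.SearchHandler">
    <arr name="components">
      <str>query</str>
      <str>facet</str>
      <str>mlt</str>
      <str>highlight</str>
      <str>debug</str>
      <str>collapse</str>
    </arr>
  </requestHandler>

If memory serves, SearchHandler treats 'components' and 'first-components' as mutually exclusive on a single handler, so declaring both arrays together is one thing worth double-checking when the collapse section never shows up.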

Thinking that everything should work correctly, I did a search with the
following:

http://localhost:8080/solr/select/?q=mika&version=2.2&start=0&rows=10&indent=on&collapse=true&collapse.field=type

I see the query parameters captured in the responseHeader section,
but I don't see a collapse section.


Does anybody have any ideas?

Any help would be greatly appreciated.

Thank you,

-John


Getting Field Collapsing working

2008-12-15 Thread John Martyniak

Hi everybody,

So I have applied Ivan's latest patch to a clean 1.3.

I built it using 'ant compile' and 'ant dist', and got the Solr build
WAR file.


Moved that into the Tomcat directory.

Modified my solrconfig.xml to include the following:
  <searchComponent name="collapse"
      class="org.apache.solr.handler.component.CollapseComponent" />

  <arr name="components">
    <str>query</str>
    <str>facet</str>
    <str>mlt</str>
    <str>highlight</str>
    <str>debug</str>
    <str>collapse</str>
  </arr>
  <arr name="first-components">
    <str>myFirstComponentName</str>
    <str>collapse</str>
  </arr>

Thinking that everything should work correctly, I did a search with the
following:

http://localhost:8080/solr/select/?q=mika&version=2.2&start=0&rows=10&indent=on&collapse=true&collapse.field=type

I see the query parameters captured in the responseHeader section,
but I don't see a collapse section.


Does anybody have any ideas?

Any help would be greatly appreciated.

Thank you,

-John



Re: Applying Field Collapsing Patch

2008-12-12 Thread John Martyniak

That worked perfectly!!!

Thank you.

I wonder why it didn't work the same way with the downloaded build.

-John

On Dec 11, 2008, at 9:40 PM, Doug Steigerwald wrote:

Have you tried just checking out (or exporting) the source from SVN
and applying the patch? It works fine for me that way.


$ svn co http://svn.apache.org/repos/asf/lucene/solr/tags/release-1.3.0 solr-1.3.0
$ cd solr-1.3.0 ; patch -p0 < ~/Downloads/collapsing-patch-to-1.3.0-ivan_2.patch


Doug

On Dec 11, 2008, at 3:50 PM, John Martyniak wrote:

It was a completely clean install. I downloaded it from one of the
mirrors right before applying the patch to it.

Very troubling. Any other suggestions or ideas?

I am running it on Mac OS; maybe I will try looking for some answers
around that.


-John

On Dec 11, 2008, at 3:05 PM, Stephen Weiss swe...@stylesight.com  
wrote:


Yes, only Ivan patch 2 (and before, only Ivan patch 1); my sense
was these patches were meant to be used in isolation (there were
no notes saying to apply any other patches first).

Are you using patches for any other purpose (non-SOLR-236)? Maybe
you need to apply this one first, then those patches. For me, using
any patch makes me nervous (we have a pretty strict policy about
using beta code anywhere); I'm only doing it this once because it's
absolutely necessary to provide the functionality desired.


--
Steve

On Dec 11, 2008, at 2:53 PM, John Martyniak wrote:


Thanks for the advice.

I just downloaded a completely clean version; I haven't even tried
to build it yet.

I applied the same patch, and I received exactly the same results.

Do you apply only the Ivan patch 2? What version of patch are you
running?


-John

On Dec 11, 2008, at 2:10 PM, Stephen Weiss wrote:

Are you sure you have a clean copy of the source? Every time
I've applied his patch, I grab a fresh copy of the tarball and
run the exact same command; it always works for me.

Now, whether the collapsing actually works is a different
matter...


--
Steve

On Dec 11, 2008, at 1:29 PM, John Martyniak wrote:


Hi,

I am trying to apply Ivan's field collapsing patch to Solr 1.3
(not a nightly), and it continuously fails. I am using the
following command:

patch -p0 -i collapsing-patch-to-1.3.0-ivan_2.patch --dry-run

I am in the apache-solr directory, and have read/write permission
for all directories and files.

I get the following results:

patching file src/test/org/apache/solr/search/TestDocSet.java
Hunk #1 FAILED at 88.
1 out of 1 hunk FAILED -- saving rejects to file src/test/org/apache/solr/search/TestDocSet.java.rej
patching file src/java/org/apache/solr/search/CollapseFilter.java
patching file src/java/org/apache/solr/search/DocSet.java
Hunk #1 FAILED at 195.
1 out of 1 hunk FAILED -- saving rejects to file src/java/org/apache/solr/search/DocSet.java.rej
patching file src/java/org/apache/solr/search/NegatedDocSet.java
patching file src/java/org/apache/solr/search/SolrIndexSearcher.java
Hunk #1 FAILED at 1357.
1 out of 1 hunk FAILED -- saving rejects to file src/java/org/apache/solr/search/SolrIndexSearcher.java.rej
patching file src/java/org/apache/solr/common/params/CollapseParams.java
patching file src/java/org/apache/solr/handler/component/CollapseComponent.java



Also the '.rej' files are not created.

Does anybody have any ideas?

thanks in advance for the help.

-John












Re: Sum of Fields and Record Count

2008-12-11 Thread John Martyniak

Hi Otis,

Thanks for the info and help. I started reading up about it (on
MarkMail, nice site), and it looks like there is some activity to put
it into 1.4. I will try to apply the patch and see how that works.
It seems like a couple of people are using it in a production
environment already, without grief. So that is a good thing.


-John

On Dec 11, 2008, at 1:24 AM, Otis Gospodnetic wrote:


Hi John,

It's not in the current release, but the chances are it will make it
into 1.4. You can try one of the recent patches and apply it to
your Solr 1.3 sources. Check the list archives for more discussion;
this field collapsing was just discussed again today/yesterday.
markmail.org is a good one.



Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



- Original Message 

From: John Martyniak [EMAIL PROTECTED]
To: solr-user@lucene.apache.org
Sent: Wednesday, December 10, 2008 10:51:57 PM
Subject: Re: Sum of Fields and Record Count

Otis,

Thanks for the information. It looks like the field collapsing is
similar to what I am looking for. But is that in the current release?
Is it stable?

Is there any way to do it in Solr 1.3?

-John

On Dec 10, 2008, at 9:59 PM, Otis Gospodnetic wrote:


Hi John,

This sounds a lot like the field collapsing functionality that a few
people are working on in SOLR-236:

https://issues.apache.org/jira/browse/SOLR-236

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



- Original Message 

From: John Martyniak
To: solr-user@lucene.apache.org
Sent: Wednesday, December 10, 2008 6:16:21 PM
Subject: Sum of Fields and Record Count

Hi,

I am a new solr user.

I have an application in which I would like to show the results, but
one result may be part of a larger set of results. So for example,
result #1 might also have 10 other results that are part of the same
data set.

Hopefully this makes sense.

What I would like to find out is if there is a way within Solr to
show the result that matched the query, and then to also show that
this result is part of a collection of 10 items.

I have thought about doing it using some sort of external process
that runs, with multiple queries: get the list of items and then
query against each item. But those don't seem elegant.

So I would like to find out if there is a way to do it within Solr
that is a little more elegant, and hopefully without having to write
additional code.


Thank you in advance for the help.

-John








Building Solr from Source

2008-12-11 Thread John Martyniak

Hi,

I have downloaded Maven 2.0.9 and tried to build using 'mvn clean
install' and 'mvn install'; nothing works.


Can somebody tell me how to build Solr from source? I am trying to
build the 1.3 source.


thank you very much,

-John


Re: Building Solr from Source

2008-12-11 Thread John Martyniak
My mistake: I saw the Maven directories and did not see the build.xml
in the src directory, so I just assumed... my bad.


Anyway built successfully, thanks.

Now to apply the field collapsing patch.

-John

On Dec 11, 2008, at 8:46 AM, Noble Paul നോബിള്‍  
नोब्ळ् wrote:



Solr uses Ant for its build.
Install Ant.
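
A minimal sketch of the Ant build, matching the commands used elsewhere in these threads:

  $ cd apache-solr-1.3.0
  $ ant compile
  $ ant dist     # the Solr WAR ends up under dist/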

On Thu, Dec 11, 2008 at 7:13 PM, John Martyniak  
[EMAIL PROTECTED] wrote:

Hi,

I have downloaded Maven 2.0.9 and tried to build using 'mvn clean
install' and 'mvn install'; nothing works.

Can somebody tell me how to build Solr from source? I am trying to
build the 1.3 source.

thank you very much,

-John





--
--Noble Paul




Applying Field Collapsing Patch

2008-12-11 Thread John Martyniak

Hi,

I am trying to apply Ivan's field collapsing patch to Solr 1.3 (not a
nightly), and it continuously fails. I am using the following command:

patch -p0 -i collapsing-patch-to-1.3.0-ivan_2.patch --dry-run

I am in the apache-solr directory, and have read/write permission for
all directories and files.

I get the following results:

patching file src/test/org/apache/solr/search/TestDocSet.java
Hunk #1 FAILED at 88.
1 out of 1 hunk FAILED -- saving rejects to file src/test/org/apache/solr/search/TestDocSet.java.rej
patching file src/java/org/apache/solr/search/CollapseFilter.java
patching file src/java/org/apache/solr/search/DocSet.java
Hunk #1 FAILED at 195.
1 out of 1 hunk FAILED -- saving rejects to file src/java/org/apache/solr/search/DocSet.java.rej
patching file src/java/org/apache/solr/search/NegatedDocSet.java
patching file src/java/org/apache/solr/search/SolrIndexSearcher.java
Hunk #1 FAILED at 1357.
1 out of 1 hunk FAILED -- saving rejects to file src/java/org/apache/solr/search/SolrIndexSearcher.java.rej
patching file src/java/org/apache/solr/common/params/CollapseParams.java
patching file src/java/org/apache/solr/handler/component/CollapseComponent.java



Also the '.rej' files are not created.

Does anybody have any ideas?

thanks in advance for the help.

-John


Re: Applying Field Collapsing Patch

2008-12-11 Thread John Martyniak

Thanks for the advice.

I just downloaded a completely clean version; I haven't even tried to
build it yet.

I applied the same patch, and I received exactly the same results.

Do you apply only the Ivan patch 2? What version of patch are you
running?


-John

On Dec 11, 2008, at 2:10 PM, Stephen Weiss wrote:

Are you sure you have a clean copy of the source? Every time I've
applied his patch, I grab a fresh copy of the tarball and run the
exact same command; it always works for me.

Now, whether the collapsing actually works is a different matter...

--
Steve

On Dec 11, 2008, at 1:29 PM, John Martyniak wrote:


Hi,

I am trying to apply Ivan's field collapsing patch to Solr 1.3 (not
a nightly), and it continuously fails. I am using the following
command:

patch -p0 -i collapsing-patch-to-1.3.0-ivan_2.patch --dry-run

I am in the apache-solr directory, and have read/write permission for
all directories and files.

I get the following results:

patching file src/test/org/apache/solr/search/TestDocSet.java
Hunk #1 FAILED at 88.
1 out of 1 hunk FAILED -- saving rejects to file src/test/org/apache/solr/search/TestDocSet.java.rej
patching file src/java/org/apache/solr/search/CollapseFilter.java
patching file src/java/org/apache/solr/search/DocSet.java
Hunk #1 FAILED at 195.
1 out of 1 hunk FAILED -- saving rejects to file src/java/org/apache/solr/search/DocSet.java.rej
patching file src/java/org/apache/solr/search/NegatedDocSet.java
patching file src/java/org/apache/solr/search/SolrIndexSearcher.java
Hunk #1 FAILED at 1357.
1 out of 1 hunk FAILED -- saving rejects to file src/java/org/apache/solr/search/SolrIndexSearcher.java.rej
patching file src/java/org/apache/solr/common/params/CollapseParams.java
patching file src/java/org/apache/solr/handler/component/CollapseComponent.java



Also the '.rej' files are not created.

Does anybody have any ideas?

thanks in advance for the help.

-John






Re: Applying Field Collapsing Patch

2008-12-11 Thread John Martyniak
It was a completely clean install. I downloaded it from one of the
mirrors right before applying the patch to it.

Very troubling. Any other suggestions or ideas?

I am running it on Mac OS; maybe I will try looking for some answers
around that.


-John

On Dec 11, 2008, at 3:05 PM, Stephen Weiss swe...@stylesight.com  
wrote:


Yes, only Ivan patch 2 (and before, only Ivan patch 1); my sense was
these patches were meant to be used in isolation (there were no
notes saying to apply any other patches first).

Are you using patches for any other purpose (non-SOLR-236)? Maybe
you need to apply this one first, then those patches. For me, using
any patch makes me nervous (we have a pretty strict policy about
using beta code anywhere); I'm only doing it this once because it's
absolutely necessary to provide the functionality desired.


--
Steve

On Dec 11, 2008, at 2:53 PM, John Martyniak wrote:


Thanks for the advice.

I just downloaded a completely clean version; I haven't even tried to
build it yet.

I applied the same patch, and I received exactly the same results.

Do you apply only the Ivan patch 2? What version of patch are you
running?


-John

On Dec 11, 2008, at 2:10 PM, Stephen Weiss wrote:

Are you sure you have a clean copy of the source? Every time I've
applied his patch, I grab a fresh copy of the tarball and run the
exact same command; it always works for me.

Now, whether the collapsing actually works is a different matter...

--
Steve

On Dec 11, 2008, at 1:29 PM, John Martyniak wrote:


Hi,

I am trying to apply Ivan's field collapsing patch to Solr 1.3
(not a nightly), and it continuously fails. I am using the
following command:

patch -p0 -i collapsing-patch-to-1.3.0-ivan_2.patch --dry-run

I am in the apache-solr directory, and have read/write permission
for all directories and files.

I get the following results:

patching file src/test/org/apache/solr/search/TestDocSet.java
Hunk #1 FAILED at 88.
1 out of 1 hunk FAILED -- saving rejects to file src/test/org/apache/solr/search/TestDocSet.java.rej
patching file src/java/org/apache/solr/search/CollapseFilter.java
patching file src/java/org/apache/solr/search/DocSet.java
Hunk #1 FAILED at 195.
1 out of 1 hunk FAILED -- saving rejects to file src/java/org/apache/solr/search/DocSet.java.rej
patching file src/java/org/apache/solr/search/NegatedDocSet.java
patching file src/java/org/apache/solr/search/SolrIndexSearcher.java
Hunk #1 FAILED at 1357.
1 out of 1 hunk FAILED -- saving rejects to file src/java/org/apache/solr/search/SolrIndexSearcher.java.rej
patching file src/java/org/apache/solr/common/params/CollapseParams.java
patching file src/java/org/apache/solr/handler/component/CollapseComponent.java



Also the '.rej' files are not created.

Does anybody have any ideas?

thanks in advance for the help.

-John








Sum of Fields and Record Count

2008-12-10 Thread John Martyniak

Hi,

I am a new solr user.

I have an application in which I would like to show the results, but
one result may be part of a larger set of results. So for example,
result #1 might also have 10 other results that are part of the same
data set.


Hopefully this makes sense.

What I would like to find out is if there is a way within Solr to show
the result that matched the query, and then to also show that this
result is part of a collection of 10 items.

I have thought about doing it using some sort of external process that
runs, with multiple queries: get the list of items and then query
against each item. But those don't seem elegant.

So I would like to find out if there is a way to do it within Solr
that is a little more elegant, and hopefully without having to write
additional code.


Thank you in advance for the help.

-John




Re: Sum of Fields and Record Count

2008-12-10 Thread John Martyniak

Grant,

Basically, I have created a text field that has the grouping value.
All of the records in a group would have the same value in this text
field. This is accomplished with some pre-processing when I capture
the data, but before it is submitted to the index.
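
A sketch of what the schema.xml side of that could look like (the field name is hypothetical; the string type is from the example schema):

  <!-- grouping value shared by all records in a set -->
  <field name="groupId" type="string" indexed="true" stored="true"/>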



-John

On Dec 10, 2008, at 8:46 PM, Grant Ingersoll [EMAIL PROTECTED]  
wrote:



Hi John,

What is your process for determining that #1 is part of the other
result set? My gut says this is a faceting problem, i.e. #1 has a
field containing its category that is also shared by the 10 other
results, and all you need to do is facet on the category field.
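
A sketch of such a facet query, assuming a field named category as above (query string illustrative):

  http://localhost:8983/solr/select?q=test&facet=true&facet.field=category&facet.mincount=1

Each facet count then tells the client how many results share a category with the matched document.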


The other thing that comes to mind is More Like This: 
http://wiki.apache.org/solr/MoreLikeThis

-Grant

On Dec 10, 2008, at 6:16 PM, John Martyniak wrote:


Hi,

I am a new solr user.

I have an application in which I would like to show the results, but
one result may be part of a larger set of results. So for example,
result #1 might also have 10 other results that are part of the same
data set.

Hopefully this makes sense.

What I would like to find out is if there is a way within Solr to
show the result that matched the query, and then to also show that
this result is part of a collection of 10 items.

I have thought about doing it using some sort of external process
that runs, with multiple queries: get the list of items and then
query against each item. But those don't seem elegant.

So I would like to find out if there is a way to do it within Solr
that is a little more elegant, and hopefully without having to write
additional code.


Thank you in advance for the help.

-John




--
Grant Ingersoll

Lucene Helpful Hints:
http://wiki.apache.org/lucene-java/BasicsOfPerformance
http://wiki.apache.org/lucene-java/LuceneFAQ












Re: Sum of Fields and Record Count

2008-12-10 Thread John Martyniak

Grant,

For the More Like This approach that would show the grouped results
once you have clicked on the item (so basically making another query):
would it show a count of the More Like This results?

Something like cxxc and a collection of 10 other items.

-John

On Dec 10, 2008, at 8:46 PM, Grant Ingersoll [EMAIL PROTECTED]  
wrote:



Hi John,

What is your process for determining that #1 is part of the other
result set? My gut says this is a faceting problem, i.e. #1 has a
field containing its category that is also shared by the 10 other
results, and all you need to do is facet on the category field.


The other thing that comes to mind is More Like This: 
http://wiki.apache.org/solr/MoreLikeThis

-Grant

On Dec 10, 2008, at 6:16 PM, John Martyniak wrote:


Hi,

I am a new solr user.

I have an application in which I would like to show the results, but
one result may be part of a larger set of results. So for example,
result #1 might also have 10 other results that are part of the same
data set.

Hopefully this makes sense.

What I would like to find out is if there is a way within Solr to
show the result that matched the query, and then to also show that
this result is part of a collection of 10 items.

I have thought about doing it using some sort of external process
that runs, with multiple queries: get the list of items and then
query against each item. But those don't seem elegant.

So I would like to find out if there is a way to do it within Solr
that is a little more elegant, and hopefully without having to write
additional code.


Thank you in advance for the help.

-John




--
Grant Ingersoll

Lucene Helpful Hints:
http://wiki.apache.org/lucene-java/BasicsOfPerformance
http://wiki.apache.org/lucene-java/LuceneFAQ












Re: Sum of Fields and Record Count

2008-12-10 Thread John Martyniak

Otis,

Thanks for the information. It looks like the field collapsing is
similar to what I am looking for. But is that in the current release?
Is it stable?

Is there any way to do it in Solr 1.3?

-John

On Dec 10, 2008, at 9:59 PM, Otis Gospodnetic wrote:


Hi John,

This sounds a lot like the field collapsing functionality that a few
people are working on in SOLR-236:


https://issues.apache.org/jira/browse/SOLR-236

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



- Original Message 

From: John Martyniak [EMAIL PROTECTED]
To: solr-user@lucene.apache.org
Sent: Wednesday, December 10, 2008 6:16:21 PM
Subject: Sum of Fields and Record Count

Hi,

I am a new solr user.

I have an application in which I would like to show the results, but
one result may be part of a larger set of results. So for example,
result #1 might also have 10 other results that are part of the same
data set.

Hopefully this makes sense.

What I would like to find out is if there is a way within Solr to
show the result that matched the query, and then to also show that
this result is part of a collection of 10 items.

I have thought about doing it using some sort of external process
that runs, with multiple queries: get the list of items and then
query against each item. But those don't seem elegant.

So I would like to find out if there is a way to do it within Solr
that is a little more elegant, and hopefully without having to write
additional code.


Thank you in advance for the help.

-John






Searchable/indexable newsgroups

2008-11-19 Thread John Martyniak
Does anybody know of a good way to index newsgroups using Solr?
Basically, I would like to build a searchable list of newsgroup content.


Any help would be greatly appreciated.

-John



Solr for Whole Web Search

2008-10-22 Thread John Martyniak

I am very new to Solr, but I have played with Nutch and Lucene.

Has anybody used Solr for a whole web indexing application?

Which Spider did you use?

How does it compare to Nutch?

Thanks in advance for all of the info.

-John



Re: Solr for Whole Web Search

2008-10-22 Thread John Martyniak

Grant thanks for the response.

A couple of other people have recommended trying the Nutch + Solr
approach, but I am not sure what the real benefit of doing that is,
since Nutch provides most of the same features as Solr, and Solr has
some nice additional features (like spell checking and incremental
indexing).


So I currently have a Nutch index of around 500,000+ URLs, but expect
it to get much bigger, and I am generally pretty happy with it; I just
want to make sure that I am going down the correct path for the best
feature set. As far as the front-end implementation is concerned, I
have been using the Nutch search app as basically a web service to
feed the main app (so using RSS). The main app takes that and
manipulates the results for display.


As far as the Hadoop + Lucene integration goes, I haven't used that
directly, just the Hadoop integration with Nutch, and of course Hadoop
independently.


-John


On Oct 22, 2008, at 10:08 AM, Grant Ingersoll wrote:



On Oct 22, 2008, at 7:57 AM, John Martyniak wrote:


I am very new to Solr, but I have played with Nutch and Lucene.

Has anybody used Solr for a whole web indexing application?

Which Spider did you use?

How does it compare to Nutch?


There is a patch that combines Nutch + Solr: Nutch is used for
crawling, Solr for searching. I can't say I've used it for whole-web
searching, but I believe some are trying it.

At the end of the day, I'm sure Solr could do it, but it will take
some work to set up the architecture (distributed, replicated) and
deal properly with fault tolerance and failover. There are also some
examples on the Hadoop side about Hadoop + Lucene integration.


How big are you talking?




Thanks in advance for all of the info.

-John



--
Grant Ingersoll
Lucene Boot Camp Training Nov. 3-4, 2008, ApacheCon US New Orleans.
http://www.lucenebootcamp.com


Lucene Helpful Hints:
http://wiki.apache.org/lucene-java/BasicsOfPerformance
http://wiki.apache.org/lucene-java/LuceneFAQ













Re: Index updates blocking readers: To Multicore or not?

2008-10-22 Thread John Martyniak

Jim,

This is an off-topic question.

But for your 30M documents, did you fetch those from external web
sites (whole-web search)? Or are they internal documents? If they are
external, what method did you use to fetch them, and which spider?

I am in the process of deciding between using Nutch for whole-web
indexing, Solr + a spider, or Nutch + Solr, etc.


Thank you in advance for your insight into this issue.

-John

On Oct 22, 2008, at 10:55 AM, Jim Murphy wrote:



Thanks Yonik,

I have more information...

1. We do indeed have large indexes: 40GB on disk, 30M documents - and
this is just a test server; we have 8 of these in parallel.

2. The performance problem I was seeing followed replication, and the
first query on a new searcher. It turns out we didn't configure index
warming queries very well, so we replaced the various 'solr rocks'
type queries with one that was better for our data - and saw no
improvement. The problem was that replication completed, a new
searcher was created and registered, but the first query would take
10-20 seconds to complete. Thereafter it took 200 milliseconds for
similar non-cached queries.

A profiler pointed us to building the FieldSortedHitQueue as taking
all the time. Our warming query did not include a sort, but our
queries commonly do. Once we added the sort parameter, our warming
query started taking the 10-20 seconds prior to registering the
searcher. After that, the first query on the new searcher took the
expected 200ms.

LESSON LEARNED: warm your caches! And if a sort is involved in your
queries, incorporate that sort in your warming query! Add a warming
query for each kind of sort that you expect to do.
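
A sketch of such warming in solrconfig.xml (the query string and sort field are illustrative):

  <listener event="newSearcher" class="solr.QuerySenderListener">
    <arr name="queries">
      <!-- one warming query per sort you expect to serve -->
      <lst>
        <str name="q">solr</str>
        <str name="sort">published_date desc</str>
      </lst>
    </arr>
  </listener>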









Yonik Seeley wrote:


On Mon, Oct 6, 2008 at 2:10 PM, Jim Murphy [EMAIL PROTECTED] wrote:
We have a farm of several master-slave pairs, all managing a single
very large logical index sharded across the master-slaves. We notice
on the slaves, after an rsync update, as the index is being committed,
that all queries are blocked, sometimes resulting in unacceptable
service times. I'm looking at ways we can manage these update burps.


Updates should never block queries.
What version of Solr are you using?
Is it possible that your indexes are so big, opening a new index in
the background causes enough of the old index to be flushed from OS
cache, causing big slowdowns?

-Yonik



Question #1: Anything obvious I can tweak in the configuration to
mitigate these multi-second blocking updates? Our indexes are 40GB,
20M documents each. Rsync updates run every 5 minutes, several hundred
KB per update.

Question #2: I'm considering setting up each slave with multiple Solr
cores. The 2 indexes per instance would be nearly identical copies,
but A would be read from while B is being updated, then they would
swap. I'll have to figure out how to rsync these 2 indexes properly,
but if I can get the commits to happen on the offline index then I
suspect my queries could proceed unblocked.

Is this the wrong tree to be barking up? Any other thoughts?
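
For the A/B scheme in Question #2, a sketch of the core swap using the CoreAdmin handler, assuming multicore is configured and the core names are illustrative:

  http://localhost:8983/solr/admin/cores?action=SWAP&core=core0&other=core1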

Thanks in advance,

Jim















Re: Index updates blocking readers: To Multicore or not?

2008-10-22 Thread John Martyniak
Thank you, that is good information, as that is the way that I am
leaning.

So when you fetch the content from RSS, does that get rendered to an
XML document that Solr indexes?

Also, what were a couple of the decision points for using Solr as
opposed to Nutch, or even straight Lucene?


-John



On Oct 22, 2008, at 11:22 AM, Jim Murphy wrote:



We index RSS content using our own home-grown distributed spiders -
not Nutch. We use Ruby processes to do the feed fetching and XML
shredding, and Amazon SQS to queue up work packets to insert into our
Solr cluster.

Sorry can't be of more help.
