Re: SolrClient.query take a 'collection' argument

2020-06-06 Thread Jim Anderson
Erick,

Thanks for the clarification on the JVM heap space. I will invoke java as
you advise.

The program that I am writing is a java example that I took off the
internet. The intent of the example is to read an existing core stored in
solr. I created the core using instructions that I found in a tutorial. I
think the example from the tutorial worked ok, because I can see the core
in solr that was created using nutch. So I think my status is that I have a
good core, and I was trying to read and print out the documents in that
core.

My current plan is to try to find and install Nutch 1.17 and then clear and
reinstall solr 8.5.1 and start over again with a clean slate.

Regards,
Jim



Re: SolrClient.query take a 'collection' argument

2020-06-06 Thread Erick Erickson
I’m not talking about how much memory your machine has;
the critical bit is how much heap space is allocated to the
JVM to run your app.

You can increase it by specifying, say, -Xmx2G when you
invoke Java.
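
To confirm that an -Xmx setting actually reached the JVM, the program can
print its own heap limit at startup. The sketch below is illustrative only:
the class name HeapCheck is made up for this example, and it relies on
nothing beyond the standard Runtime API.

```java
// Hypothetical helper (not from the thread): reports the heap limit
// the running JVM was started with.
final class HeapCheck {
    /** Maximum heap the JVM will attempt to use, in MiB. */
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("Max heap: " + maxHeapMiB() + " MiB");
    }
}
```

Started as `java -Xmx2G HeapCheck`, it should report a value close to
2048 MiB; the exact figure can differ slightly by JVM and garbage collector.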

The version difference is suspicious indeed. I’m a little 
confused here. Exactly _what_ program is crashing? An
independent app you wrote or nutch? If the former, you could
try compiling your Java app against the Solr jars provided
with the Solr version that ships with Nutch 1.16 (Solr 7.3.1?).

Best,
Erick




Re: SolrClient.query take a 'collection' argument

2020-06-06 Thread Jim Anderson
Erick,

Thanks for the suggestion. I will keep it in the back of my mind for now.
My PC has 8 G-bytes of memory and has roughly 4 G-bytes in use.

In the forefront, I'm looking at the recommended solr/nutch combinations.
I'm using Solr 8.5.1 with nutch 1.16. The recommendation is to use nutch
1.17 with Solr 8.5.1, but 1.17 has not been released for download.
Consequently, I used nutch 1.16. I'm not sure that will make a difference,
but I am suspicious.

Jim



Re: SolrClient.query take a 'collection' argument

2020-06-06 Thread Erick Erickson
I’d look for an OutOfMemory problem before going too much farther.
The simplest way to see if that’s in the right direction would be to
run your SolrJ program with a massive memory size. Perhaps monitor
your program with jconsole or similar to see if there are any clues about
memory usage.

OOMs lead to unpredictable behavior, so it’s at least a possibility that
this is the root cause. If so, there’s nothing SolrJ can do about it exactly
because the state of a program is indeterminate afterwards, even if the
OOM is caught somewhere. I suppose you could also try to catch that
exception in the top-level of your program.
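
A minimal sketch of catching the error at the top level of a stand-alone
program, assuming the SolrJ work is wrapped in a Runnable. TopLevelGuard and
runGuarded are hypothetical names, and the empty task stands in for the real
query code. Because the program state after an OutOfMemoryError is not
reliable, the sketch only reports the failure and exits.

```java
// Hypothetical skeleton; the task passed to runGuarded() is a placeholder
// for the code that builds the client and calls client.query(...).
final class TopLevelGuard {
    /** Runs the task and reports any Throwable. Returns true on success. */
    static boolean runGuarded(Runnable task) {
        try {
            task.run();
            return true;
        } catch (OutOfMemoryError oom) {
            // State is indeterminate after an OOM: report it and stop.
            System.err.println("Out of memory: " + oom.getMessage());
            return false;
        } catch (Throwable t) {
            // Catches Errors as well as Exceptions, which a plain
            // catch (Exception e) block would miss.
            System.err.println("Unexpected failure: " + t);
            return false;
        }
    }

    public static void main(String[] args) {
        boolean ok = runGuarded(() -> { /* placeholder for the query code */ });
        if (!ok) {
            System.exit(1);
        }
    }
}
```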

I’m assuming a stand-alone program here; if you’re running some custom
code in Solr itself, make sure the oom-killer script is running.

Best,
Erick




Re: SolrClient.query take a 'collection' argument

2020-06-06 Thread Jim Anderson
Shawn,

Thanks for the explanation. Very good response.

The first paragraph helped clarify what a collection is. I have read quite
a bit about Solr. There is so much to absorb that it is slowly sinking in.
Your 2nd paragraph definitely answered my question, i.e. passing a core
name should be ok when a collection name is specified as a method argument.
This is what I did.

Regarding the 3rd paragraph, it is good to know that SolrJ is fairly robust
and should not be crashing. Nevertheless, that is what is happening. The
call to client.query() is wrapped in a try/catch sequence. Apparently no
exceptions were detected, or the program crashed before the exception could
be raised.

My next step is to check where I can report this to the Solr folks and see
if they can figure out why it is crashing. BTW, I had not checked my
output file before this morning. The output file indicates that the program
ran to completion, so I am guessing that at least one other thread is being
created and that that thread is crashing.
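
One way to test the guess about a second thread is to install a default
uncaught-exception handler before running the query, so that a failure in
any thread is reported instead of vanishing. This is a sketch using only the
JDK; ThreadCrashLogger is a made-up name, and the failing worker thread is
simulated rather than taken from SolrJ.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical helper: records and prints any exception that escapes a thread.
final class ThreadCrashLogger {
    static final AtomicReference<Throwable> lastFailure = new AtomicReference<>();

    /** Installs a JVM-wide handler for exceptions no thread caught. */
    static void install() {
        Thread.setDefaultUncaughtExceptionHandler((thread, error) -> {
            lastFailure.set(error);
            System.err.println("Thread " + thread.getName() + " died: " + error);
        });
    }

    /** Runs the task in a separate thread and returns what escaped it, or null. */
    static Throwable runInThread(Runnable task) {
        lastFailure.set(null);
        Thread worker = new Thread(task, "worker-1");
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return lastFailure.get();
    }

    public static void main(String[] args) {
        install();
        // Simulated failure in a background thread; main still runs to completion.
        Throwable seen = runInThread(() -> {
            throw new IllegalStateException("simulated failure");
        });
        System.out.println("main finished; worker failure: " + seen);
    }
}
```

Calling install() at the very start of main() would make any such background
failure show up on stderr even though the main thread finishes normally.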

Regards,
Jim



Re: SolrClient.query take a 'collection' argument

2020-06-05 Thread Shawn Heisey

On 6/5/2020 4:24 PM, Jim Anderson wrote:

> I am running my first solrj program and it is crashing when I call the
> method
>
> client.query("coreName",queryParms)
>
> The API doc says the string should be a collection. I'm still not sure
> about the difference between a collection and a core, so what I am doing is
> likely illegal. Given that I have created a core, should I create a
> collection from it so that I can truly pass a collection name to the query
> function?


The concept of a collection comes from SolrCloud.  A collection is made 
up of one or more shards.  A shard is made up of one or more replicas. 
Each replica is a core.  If you're not running SolrCloud, then you do 
not have collections.
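
The hierarchy described above can be sketched as a small data model. This is
purely illustrative: the classes below are hypothetical and are not SolrJ or
Solr types, and the core names are invented for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative model only; none of these classes exist in Solr or SolrJ.
final class SolrTopology {
    /** A shard is one or more replicas; each replica is a core (named here). */
    static final class Shard {
        final List<String> replicaCoreNames;

        Shard(List<String> replicaCoreNames) {
            this.replicaCoreNames = replicaCoreNames;
        }
    }

    /** A SolrCloud collection is one or more shards. */
    static final class Collection {
        final String name;
        final List<Shard> shards;

        Collection(String name, List<Shard> shards) {
            this.name = name;
            this.shards = shards;
        }

        /** Every core backing this collection, across all shards. */
        List<String> coreNames() {
            List<String> cores = new ArrayList<>();
            for (Shard shard : shards) {
                cores.addAll(shard.replicaCoreNames);
            }
            return cores;
        }
    }

    /** Builds a made-up two-shard collection for demonstration. */
    static Collection example() {
        Shard shard1 = new Shard(List.of("docs_shard1_replica_n1", "docs_shard1_replica_n2"));
        Shard shard2 = new Shard(List.of("docs_shard2_replica_n1"));
        return new Collection("docs", List.of(shard1, shard2));
    }

    public static void main(String[] args) {
        System.out.println(example().coreNames());
    }
}
```

In a non-cloud install there is no Collection level at all, only individual
cores.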


Wherever the SolrJ docs say "collection" as a parameter for a request, it
is likely that you can think "core" instead and have it still be
correct.  If you're running SolrCloud, you'll want to be very careful to
know the difference.


It seems very odd for a SolrJ query to cause the program to crash.  It 
would be pretty common for it to throw an exception, but that's not the 
same as a crash, unless exception handling is incorrect or missing.


Thanks,
Shawn


SolrClient.query take a 'collection' argument

2020-06-05 Thread Jim Anderson
I am running my first solrj program and it is crashing when I call the
method

client.query("coreName",queryParms)

The API doc says the string should be a collection. I'm still not sure
about the difference between a collection and a core, so what I am doing is
likely illegal. Given that I have created a core, should I create a
collection from it so that I can truly pass a collection name to the query
function?

Jim A.