Running Simple Streaming expressions in a loop through SolrJ stops with read timeout after a few iterations

2021-03-02 Thread ufuk yılmaz
I’m using the following example on Lucidworks to use streaming expressions from 
SolrJ:

https://lucidworks.com/post/streaming-expressions-in-solrj/

The problem is, when I run it inside a for loop, even the simplest expression 
(echo) stops executing after about 5 iterations. I thought the underlying 
HttpClient was not closing the TCP connection to the Solr host, and that after 4-5 
iterations it hits the OS's max-connections-per-host limit (mine is 
Windows 10) and stops working.

But then I tried manually supplying a SolrClientCache with a custom-configured 
HttpClient; debugging showed my custom HttpClient was being used by the 
stream, but whatever I tried didn't change the outcome.

Do you have any idea about this problem? Am I on the right track about 
HttpClient not closing/reusing a connection after an expression is finished? Or 
is there another issue?

I also tried this with different expressions, but the result didn't change.

I created a gist to share my code here: https://git.io/Jqevp
but I’m pasting a shortened version here to read without going there: 

-
String workerUrl = "http://mySolrHost:8983/solr/WorkerCollection";

String expr = "echo(x)";

for (int i = 0; i < 20; i++) {

    ModifiableSolrParams modifiableSolrParams =
        new ModifiableSolrParams()
            .set("expr", expr.replaceAll("x", Integer.toString(i)))
            .set("preferLocalShards", true)
            .set("qt", "/stream");

    TupleStream tplStream = new SolrStream(workerUrl, modifiableSolrParams);

    tplStream.setStreamContext(new StreamContext());

    tplStream.open();

    Tuple tuple = tplStream.read();
    System.out.println(tuple.fields);

    tplStream.close();
}
-
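
For reference, a sketch of one way to attack this, assuming the stall comes from
connections not being released: share a single SolrClientCache across iterations
through the StreamContext, and drain each stream to its EOF tuple before closing.
This is an assumption about the cause, not a confirmed resolution; names and the
URL follow the snippet above.

import org.apache.solr.client.solrj.io.SolrClientCache;
import org.apache.solr.client.solrj.io.Tuple;
import org.apache.solr.client.solrj.io.stream.SolrStream;
import org.apache.solr.client.solrj.io.stream.StreamContext;
import org.apache.solr.client.solrj.io.stream.TupleStream;
import org.apache.solr.common.params.ModifiableSolrParams;

public class EchoLoop {
    public static void main(String[] args) throws Exception {
        String workerUrl = "http://mySolrHost:8983/solr/WorkerCollection";
        SolrClientCache cache = new SolrClientCache(); // shared across iterations

        try {
            for (int i = 0; i < 20; i++) {
                ModifiableSolrParams params = new ModifiableSolrParams()
                        .set("expr", "echo(" + i + ")")
                        .set("qt", "/stream");

                TupleStream stream = new SolrStream(workerUrl, params);
                StreamContext context = new StreamContext();
                context.setSolrClientCache(cache); // reuse one HttpClient pool
                stream.setStreamContext(context);

                stream.open();
                try {
                    // Drain to the EOF tuple so the connection can be reused.
                    for (Tuple t = stream.read(); !t.EOF; t = stream.read()) {
                        System.out.println(t.fields);
                    }
                } finally {
                    stream.close();
                }
            }
        } finally {
            cache.close(); // release the cached clients once, at the very end
        }
    }
}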

Sent from Mail for Windows 10



Re: SolrJ: SolrInputDocument.addField()

2021-02-16 Thread Shawn Heisey

On 2/15/2021 10:17 AM, Steven White wrote:

Yes, I have managed schema enabled like so:

<schemaFactory class="ManagedIndexSchemaFactory">
  <bool name="mutable">true</bool>
  <str name="managedSchemaResourceName">cp-schema.xml</str>
</schemaFactory>

The reason why I enabled it is so that I can dynamically customize the
schema based on what's in the DB.  So that I can add fields to the schema
dynamically.


A managed/mutable schema is a configuration detail that's separate from 
(and required by) the update processor that guesses unknown fields.  It 
has been the default schema factory used in out-of-the-box 
configurations for quite a while.



I guess a better question, to meet my need, is this: how do I tell Solr, in
schema-less mode, to use *my* defined field-type whenever it needs to
create a new field?


The config for that is described here:

https://lucene.apache.org/solr/guide/8_6/schemaless-mode.html#enable-field-class-guessing

It is a bad idea to rely on field guessing for a production index.  Even 
the most carefully designed configuration cannot get it right every 
time.  You're very likely to run into situations where the software's 
best guess turns out to be wrong for your needs.  And then you're forced 
into what you should have done in the first place -- manually fixing the 
definition for that field, which usually also requires reindexing from 
scratch.


One counter-argument to what I stated in the last paragraph that 
frequently comes up is "my data is very well curated and consistent." 
But if that is the case, then you will know what fields and types are 
required *in advance* and you can easily construct a schema yourself 
before sending any data for indexing -- no guessing required.


Thanks,
Shawn


Re: SolrJ: SolrInputDocument.addField()

2021-02-16 Thread Jimi Hullegård
Hi Steven,

Just a thought, from someone who has never used schema-less mode: have you 
considered using a regular schema file, with a bunch of dynamicField 
definitions? Then you can for example define a dynamic boolean field like this:

<dynamicField name="*_b" type="boolean" indexed="true" stored="true"/>
Then, when you index the data, you can append "_b" to the field name for all 
boolean values. So if you for example want to index searchable: true, then you 
send that data with the fieldname "searchable_b" and solr will index it as a 
Boolean field.
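
A minimal SolrJ sketch of that suggestion (the client URL and collection are made
up; the "*_b" dynamicField is assumed to exist in the schema):

import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.common.SolrInputDocument;

public class DynamicFieldExample {
    public static void main(String[] args) throws Exception {
        try (SolrClient client = new HttpSolrClient.Builder(
                "http://localhost:8983/solr/myCollection").build()) {
            SolrInputDocument doc = new SolrInputDocument();
            doc.addField("id", "doc-1");
            // The "_b" suffix matches the dynamicField pattern "*_b",
            // so Solr indexes the value with the boolean field type.
            doc.addField("searchable_b", true);
            client.add(doc);
            client.commit();
        }
    }
}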

/Jimi

Steven White wrote:
>
> Hi Shawn,
>
> Yes, I have managed schema enabled like so:
>
> <schemaFactory class="ManagedIndexSchemaFactory">
>   <bool name="mutable">true</bool>
>   <str name="managedSchemaResourceName">cp-schema.xml</str>
> </schemaFactory>
>
> The reason why I enabled it is so that I can dynamically customize the schema 
> based on what's in the DB.  So that I can add fields to the schema 
> dynamically.
>
> I didn't know about the field "guessing" part.  Now that I know, I see this in 
> my solrconfig.xml file:
>
> <updateRequestProcessorChain name="add-unknown-fields-to-the-schema"
>     default="${update.autoCreateFields:true}"
>     processor="uuid,remove-blank,field-name-mutating,parse-boolean,parse-long,parse-double,parse-date,add-schema-fields">
>   <processor class="solr.LogUpdateProcessorFactory"/>
>   <processor class="solr.DistributedUpdateProcessorFactory"/>
>   <processor class="solr.RunUpdateProcessorFactory"/>
> </updateRequestProcessorChain>
>
> If I remove this block, what will happen?
>
> I guess a better question, to meet my need, is this: how do I tell Solr, in 
> schema-less mode, to use *my* defined field-type whenever it needs to create 
> a new field?
>
> I'm on Solr 8.6.1 and the link at
> https://lucene.apache.org/solr/guide/8_6/schema-factory-definition-in-solrconfig.html#schema-factory-definition-in-solrconfig
> doesn't offer much help.
>
> Thanks
>
> Steven


Re: SolrJ: SolrInputDocument.addField()

2021-02-15 Thread Steven White
Hi Shawn,

Yes, I have managed schema enabled like so:

<schemaFactory class="ManagedIndexSchemaFactory">
  <bool name="mutable">true</bool>
  <str name="managedSchemaResourceName">cp-schema.xml</str>
</schemaFactory>

The reason why I enabled it is so that I can dynamically customize the
schema based on what's in the DB.  So that I can add fields to the schema
dynamically.

I didn't know about the field "guessing" part.  Now that I know, I see this
in my solrconfig.xml file:

<updateRequestProcessorChain name="add-unknown-fields-to-the-schema"
    default="${update.autoCreateFields:true}"
    processor="uuid,remove-blank,field-name-mutating,parse-boolean,parse-long,parse-double,parse-date,add-schema-fields">
  <processor class="solr.LogUpdateProcessorFactory"/>
  <processor class="solr.DistributedUpdateProcessorFactory"/>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>

If I remove this block, what will happen?

I guess a better question, to meet my need, is this: how do I tell Solr, in
schema-less mode, to use *my* defined field-type whenever it needs to
create a new field?

I'm on Solr 8.6.1 and the link at
https://lucene.apache.org/solr/guide/8_6/schema-factory-definition-in-solrconfig.html#schema-factory-definition-in-solrconfig
doesn't offer much help.

Thanks

Steven


On Mon, Feb 15, 2021 at 11:09 AM Shawn Heisey  wrote:

> On 2/15/2021 6:52 AM, Steven White wrote:
> > It looks to me that SolrInputDocument.addField() is either misnamed or
> > isn't well implemented.
> >
> > When it is called on a field that doesn't exist in the schema, it will
> > create that field and give it a type based on the data.  Not only that,
> it
> > will set default values.  For example, this call
> >
> >  SolrInputDocument doc = new SolrInputDocument();
> >  doc.addField("Company", "ACM company");
> >
> > Will create the following:
> >
> >  
> >  
>
> That SolrJ code does not make those changes to your schema.  At least
> not in the way you're thinking.
>
> It sounds to me like your solrconfig.xml includes what we call
> "schemaless mode" -- an update processor that adds unknown fields when
> they are indexed.  You should disable it.  We strongly recommend never
> using it in production, because it can make the wrong guess about which
> fieldType is required.  The fieldType chosen has very little to do with
> the SolrJ code.  It is controlled by what's in solrconfig.xml.
>
> Thanks,
> Shawn
>


Re: SolrJ: SolrInputDocument.addField()

2021-02-15 Thread Shawn Heisey

On 2/15/2021 6:52 AM, Steven White wrote:

It looks to me that SolrInputDocument.addField() is either misnamed or
isn't well implemented.

When it is called on a field that doesn't exist in the schema, it will
create that field and give it a type based on the data.  Not only that, it
will set default values.  For example, this call

 SolrInputDocument doc = new SolrInputDocument();
 doc.addField("Company", "ACM company");

Will create the following:

 
 


That SolrJ code does not make those changes to your schema.  At least 
not in the way you're thinking.


It sounds to me like your solrconfig.xml includes what we call 
"schemaless mode" -- an update processor that adds unknown fields when 
they are indexed.  You should disable it.  We strongly recommend never 
using it in production, because it can make the wrong guess about which 
fieldType is required.  The fieldType chosen has very little to do with 
the SolrJ code.  It is controlled by what's in solrconfig.xml.


Thanks,
Shawn


Re: SolrJ: SolrInputDocument.addField()

2021-02-15 Thread Steven White
Thanks Shawn.

It looks to me that SolrInputDocument.addField() is either misnamed or
isn't well implemented.

When it is called on a field that doesn't exist in the schema, it will
create that field and give it a type based on the data.  Not only that, it
will set default values.  For example, this call

SolrInputDocument doc = new SolrInputDocument();
doc.addField("Company", "ACM company");

Will create the following:




Since this is happening without the caller knowing it (this is not
documented), it leads to search issues (the intended analyzer will not be
used, to name one).

It looks to me that the only way I can fix this is to first create the
field type and then call addField() passing it the field I
just created.  To that end, I cannot find a SolrJ API to create a field
type.  Is that the case?

Steven.
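
For reference, SolrJ does ship Schema API request classes that can create a field
type and a field programmatically. A minimal sketch, assuming a Solr 8.x SolrJ on
the classpath; the URL, type name, and attribute values here are made up:

import java.util.LinkedHashMap;
import java.util.Map;

import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.request.schema.FieldTypeDefinition;
import org.apache.solr.client.solrj.request.schema.SchemaRequest;

public class CreateFieldTypeSketch {
    public static void main(String[] args) throws Exception {
        try (SolrClient client = new HttpSolrClient.Builder(
                "http://localhost:8983/solr/myCollection").build()) {
            // Create a field type, mirroring a <fieldType .../> declaration.
            Map<String, Object> typeAttrs = new LinkedHashMap<>();
            typeAttrs.put("name", "myCustomType");
            typeAttrs.put("class", "solr.TextField");
            FieldTypeDefinition typeDef = new FieldTypeDefinition();
            typeDef.setAttributes(typeAttrs);
            new SchemaRequest.AddFieldType(typeDef).process(client);

            // Create a field that uses the new type.
            Map<String, Object> fieldAttrs = new LinkedHashMap<>();
            fieldAttrs.put("name", "Company");
            fieldAttrs.put("type", "myCustomType");
            fieldAttrs.put("stored", true);
            new SchemaRequest.AddField(fieldAttrs).process(client);
        }
    }
}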


On Sun, Feb 14, 2021 at 7:17 PM Shawn Heisey  wrote:

> On 2/14/2021 9:00 AM, Steven White wrote:
> > It looks like I'm misusing SolrJ API  SolrInputDocument.addField() thus I
> > need clarification.
> >
> > Here is an example of what I have in my code:
> >
> >  SolrInputDocument doc = new SolrInputDocument();
> >  doc.addField("MyFieldOne", "some data");
> >  doc.addField("MyFieldTwo", 100);
> >
> > The above code is creating 2 fields for me (if they don't exist already)
> > and then indexing the data to those fields.  The data is "some data" and
> > the number 100. However, when the field is created, it is not using the
> > field type that I custom created in my schema.  My question is, how do I
> > tell addField() to use my custom field type?
>
> There is no way in SolrJ code to control which fieldType is used.  That
> is controlled solely by the server-side schema definition.
>
> How do you know that Solr is not using the correct fieldType?  If you
> are looking at the documents returned by a search and aren't seeing the
> transformations described in the schema, you're looking in the wrong place.
>
> Solr search results always return what was originally sent in for
> indexing.  Only Update Processors (defined in solrconfig.xml, not the
> schema) can affect what gets returned in results; fieldType definitions
> NEVER affect data returned in search results.
>
> Thanks,
> Shawn
>


Re: SolrJ: SolrInputDocument.addField()

2021-02-14 Thread Shawn Heisey

On 2/14/2021 9:00 AM, Steven White wrote:

It looks like I'm misusing SolrJ API  SolrInputDocument.addField() thus I
need clarification.

Here is an example of what I have in my code:

 SolrInputDocument doc = new SolrInputDocument();
 doc.addField("MyFieldOne", "some data");
 doc.addField("MyFieldTwo", 100);

The above code is creating 2 fields for me (if they don't exist already)
and then indexing the data to those fields.  The data is "some data" and
the number 100. However, when the field is created, it is not using the
field type that I custom created in my schema.  My question is, how do I
tell addField() to use my custom field type?


There is no way in SolrJ code to control which fieldType is used.  That 
is controlled solely by the server-side schema definition.


How do you know that Solr is not using the correct fieldType?  If you 
are looking at the documents returned by a search and aren't seeing the 
transformations described in the schema, you're looking in the wrong place.


Solr search results always return what was originally sent in for 
indexing.  Only Update Processors (defined in solrconfig.xml, not the 
schema) can affect what gets returned in results; fieldType definitions 
NEVER affect data returned in search results.


Thanks,
Shawn


SolrJ: SolrInputDocument.addField()

2021-02-14 Thread Steven White
Hi everyone,

It looks like I'm misusing SolrJ API  SolrInputDocument.addField() thus I
need clarification.

Here is an example of what I have in my code:

SolrInputDocument doc = new SolrInputDocument();
doc.addField("MyFieldOne", "some data");
doc.addField("MyFieldTwo", 100);

The above code is creating 2 fields for me (if they don't exist already)
and then indexing the data to those fields.  The data is "some data" and
the number 100. However, when the field is created, it is not using the
field type that I custom created in my schema.  My question is, how do I
tell addField() to use my custom field type?

I _think_ I have to first SolrInputDocument.createField() and then call
SolrInputDocument.addField()?  Or is the process of indexing data into a
field done via some other API I overlooked?

I need some guidance to make sure I get the logic right.

Thanks.

Steven


Re: Change uniqueKey using SolrJ

2021-02-01 Thread Jason Gerlowski
Hi,

SolrJ doesn't have any purpose-made request class to change the
uniqueKey, afaict.  However doing so is still possible (though less
convenient) using the "GenericSolrRequest" class, which can be used to
hit arbitrary Solr APIs.

If you'd like to see better support for this in SolrJ, open a JIRA
ticket with the details of what you're trying to do (or a PR directly)
and I'd be happy to take a look.
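
For illustration, a sketch of hitting an arbitrary schema endpoint this way. This
one only *reads* the current uniqueKey; whether a particular Solr version accepts
a uniqueKey change through any API is not confirmed here, so treat the write path
as an open question:

import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.SolrRequest;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.request.GenericSolrRequest;
import org.apache.solr.client.solrj.response.SimpleSolrResponse;
import org.apache.solr.common.params.ModifiableSolrParams;

public class UniqueKeySketch {
    public static void main(String[] args) throws Exception {
        try (SolrClient client = new HttpSolrClient.Builder(
                "http://localhost:8983/solr/myCollection").build()) {
            GenericSolrRequest request = new GenericSolrRequest(
                    SolrRequest.METHOD.GET, "/schema/uniquekey",
                    new ModifiableSolrParams());
            SimpleSolrResponse response = request.process(client);
            // The response key "uniqueKey" follows the Schema API's JSON output.
            System.out.println(response.getResponse().get("uniqueKey"));
        }
    }
}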

Best,

Jason

On Fri, Jan 22, 2021 at 9:29 AM Timo Grün  wrote:
>
> Hi All,
>
> I’m currently trying to change the uniqueKey of my Solr Cloud schema using 
> Solrj.
> While creating new Fields and FieldDefinitions is pretty straightforward, I 
> struggle to find any solution to change the Unique Key field with Solrj.
>
> Any advice here?
>
> Best Regards,
>
> Timo Gruen
>


Re: Getting Solr's statistic using SolrJ

2021-02-01 Thread Jason Gerlowski
Hi Steven,

AFAIK, SolrJ doesn't have built in request objects for the metrics
API.  But you can still use the "GenericSolrRequest" class to hit any
Solr API:

e.g.

ModifiableSolrParams params = new ModifiableSolrParams();
params.set("action", "list");
GenericSolrRequest request = new GenericSolrRequest(
    SolrRequest.METHOD.GET, "/admin/metrics/history", params);
final SimpleSolrResponse response = request.process(solrClient);
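
Continuing the fragment above, the payload comes back as a raw NamedList; a
hedged follow-up (the top-level "metrics" key is an assumption based on the
handler's JSON output):

NamedList<Object> body = response.getResponse();
System.out.println(body.get("metrics")); // key name assumed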

Hope that helps,

Jason

On Fri, Jan 22, 2021 at 11:21 AM Gael Jourdan-Weil
 wrote:
>
> Hello Steven,
>
> I believe what you are looking for cannot be accessed using SolrJ (I didn't 
> really check though).
>
> But you can easily access it either via the Collections APIs and/or the 
> Metrics API depending on what you need exactly.
> See https://lucene.apache.org/solr/guide/8_4/cluster-node-management.html and 
> https://lucene.apache.org/solr/guide/8_4/metrics-reporting.html
>
> Gaël
>
>
> From: Steven White
> Sent: Friday, January 22, 2021 16:46
> To: solr-user@lucene.apache.org
> Subject: Getting Solr's statistic using SolrJ
>
> Hi everyone,
>
> Is there a SolrJ API that I can use to collect statistics data about Solr
> (everything that I see on the dashboard if possible)?
>
> I am in need to collect data about Solr instances, those same data that I
> see on the dashboard such as swap-memory, jvm-memory, list of cores, info
> about each core, etc. etc. using SolrJ API.
>
> Thanks
>
> Steven


RE: Getting Solr's statistic using SolrJ

2021-01-22 Thread Gael Jourdan-Weil
Hello Steven,

I believe what you are looking for cannot be accessed using SolrJ (I didn't 
really check though).

But you can easily access it either via the Collections APIs and/or the Metrics 
API depending on what you need exactly.
See https://lucene.apache.org/solr/guide/8_4/cluster-node-management.html and 
https://lucene.apache.org/solr/guide/8_4/metrics-reporting.html

Gaël


From: Steven White
Sent: Friday, January 22, 2021 16:46
To: solr-user@lucene.apache.org
Subject: Getting Solr's statistic using SolrJ
 
Hi everyone,

Is there a SolrJ API that I can use to collect statistics data about Solr
(everything that I see on the dashboard if possible)?

I am in need to collect data about Solr instances, those same data that I
see on the dashboard such as swap-memory, jvm-memory, list of cores, info
about each core, etc. etc. using SolrJ API.

Thanks

Steven

Getting Solr's statistic using SolrJ

2021-01-22 Thread Steven White
Hi everyone,

Is there a SolrJ API that I can use to collect statistics data about Solr
(everything that I see on the dashboard if possible)?

I am in need to collect data about Solr instances, those same data that I
see on the dashboard such as swap-memory, jvm-memory, list of cores, info
about each core, etc. etc. using SolrJ API.

Thanks

Steven


Change uniqueKey using SolrJ

2021-01-22 Thread Timo Grün
Hi All,

I’m currently trying to change the uniqueKey of my Solr Cloud schema using 
Solrj.
While creating new Fields and FieldDefinitions is pretty straightforward, I 
struggle to find any solution to change the Unique Key field with Solrj.

Any advice here?

Best Regards,

Timo Gruen



Re: Sending compressed (gzip) UpdateRequest with SolrJ

2021-01-08 Thread Walter Underwood
Years ago, working on the Ultraseek spider, we did a bunch of tests on 
compressed HTTP.
I expected it to be a big win, but the results were really inconclusive. 
Sometimes it was faster,
sometimes it was slower. We left it turned off.

It is an absolute win for serving already-compressed static content with Apache 
or whatever.
For dynamic content, it will increase some amount of delay as stuff is 
compressed before
sending. If the content already fits in one or two packets, it is just extra 
overhead. For really
large data, it helps with transmission time, but the processing time for large 
data probably
overwhelms the network time.

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)

> On Jan 8, 2021, at 12:01 AM, Gael Jourdan-Weil 
>  wrote:
> 
> You're right Matthew.
> 
> Jetty supports it for responses but for requests it doesn't seem to be the 
> default.
> However, I found an undocumented configuration option that needs to be set on 
> the GzipHandler for it to work: inflateBufferSize.
> 
> For SolrJ it's still hacky to send gzip requests; it may be easier to use a 
> regular HTTP call.
> 
> ---
> 
> De : matthew sporleder 
> Envoyé : jeudi 7 janvier 2021 16:43
> À : solr-user@lucene.apache.org 
> Objet : Re: Sending compressed (gzip) UpdateRequest with SolrJ 
>  
> jetty supports http gzip and I've added it to solr before in my own
> installs (and submitted patches to do so by default to solr) but I
> don't know about the handling for solrj.
> 
> IME compression helps a little, sometimes a lot, and never hurts.
> Even the admin interface benefits a lot from regular old http gzip
> 
> On Thu, Jan 7, 2021 at 8:03 AM Gael Jourdan-Weil
>  wrote:
>> 
>> Answering to myself on this one.
>> 
>> Solr uses Jetty 9.x, which does not support compressed requests by itself, 
>> meaning the application behind Jetty (that is, Solr) has to decompress them 
>> itself, which is not the case for now.
>> Thus even without using SolrJ, sending XML compressed in GZIP to Solr (with 
>> cURL for instance) is not possible for now.
>> 
>> Seems quite surprising to me though.
>> 
>> -
>> 
>> Hello,
>> 
>> I was wondering if someone ever had the need to send compressed (gzip) 
>> update requests (adding/deleting documents), especially using SolrJ.
>> 
>> Somehow I expected it to be done by default, but didn't find any 
>> documentation about it and when looking at the code it seems there is no 
>> option to do it. Or is javabin compressed by default?
>> - 
>> https://github.com/apache/lucene-solr/blob/master/solr/solrj/src/java/org/apache/solr/client/solrj/impl/BinaryRequestWriter.java#L49
>> - 
>> https://github.com/apache/lucene-solr/blob/master/solr/solrj/src/java/org/apache/solr/client/solrj/request/RequestWriter.java#L55
>>  (if not using Javabin)
>> - 
>> https://github.com/apache/lucene-solr/blob/master/solr/solrj/src/java/org/apache/solr/client/solrj/impl/Http2SolrClient.java#L587
>> 
>> By the way, is there any documentation about javabin? I could only find one 
>> on the "old wiki".
>> 
>> Thanks,
>> Gaël



RE: Sending compressed (gzip) UpdateRequest with SolrJ

2021-01-08 Thread Gael Jourdan-Weil
You're right Matthew.

Jetty supports it for responses but for requests it doesn't seem to be the 
default.
However, I found an undocumented configuration option that needs to be set on the 
GzipHandler for it to work: inflateBufferSize.

For SolrJ it's still hacky to send gzip requests; it may be easier to use a 
regular HTTP call.
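
A minimal embedded-Jetty 9.4 sketch of that setting, for illustration only. Solr
configures Jetty through its own jetty.xml, so this shows the mechanism rather
than a drop-in Solr patch:

import org.eclipse.jetty.server.Server;
import org.eclipse.jetty.server.handler.gzip.GzipHandler;
import org.eclipse.jetty.servlet.ServletContextHandler;

public class GzipRequestServer {
    public static void main(String[] args) throws Exception {
        Server server = new Server(8080);
        ServletContextHandler context = new ServletContextHandler();

        GzipHandler gzipHandler = new GzipHandler();
        // A buffer size > 0 tells Jetty to inflate gzipped *request* bodies;
        // the default of 0 leaves request decompression switched off.
        gzipHandler.setInflateBufferSize(8192);
        gzipHandler.setHandler(context);

        server.setHandler(gzipHandler);
        server.start();
        server.join();
    }
}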

---

From: matthew sporleder
Sent: Thursday, January 7, 2021 16:43
To: solr-user@lucene.apache.org
Subject: Re: Sending compressed (gzip) UpdateRequest with SolrJ
 
jetty supports http gzip and I've added it to solr before in my own
installs (and submitted patches to do so by default to solr) but I
don't know about the handling for solrj.

IME compression helps a little, sometimes a lot, and never hurts.
Even the admin interface benefits a lot from regular old http gzip

On Thu, Jan 7, 2021 at 8:03 AM Gael Jourdan-Weil
 wrote:
>
> Answering to myself on this one.
>
> Solr uses Jetty 9.x, which does not support compressed requests by itself, 
> meaning the application behind Jetty (that is, Solr) has to decompress them 
> itself, which is not the case for now.
> Thus even without using SolrJ, sending XML compressed in GZIP to Solr (with 
> cURL for instance) is not possible for now.
>
> Seems quite surprising to me though.
>
> -
>
> Hello,
>
> I was wondering if someone ever had the need to send compressed (gzip) update 
> requests (adding/deleting documents), especially using SolrJ.
>
> Somehow I expected it to be done by default, but didn't find any 
> documentation about it and when looking at the code it seems there is no 
> option to do it. Or is javabin compressed by default?
> - 
> https://github.com/apache/lucene-solr/blob/master/solr/solrj/src/java/org/apache/solr/client/solrj/impl/BinaryRequestWriter.java#L49
> - 
> https://github.com/apache/lucene-solr/blob/master/solr/solrj/src/java/org/apache/solr/client/solrj/request/RequestWriter.java#L55
>  (if not using Javabin)
> - 
> https://github.com/apache/lucene-solr/blob/master/solr/solrj/src/java/org/apache/solr/client/solrj/impl/Http2SolrClient.java#L587
>
> By the way, is there any documentation about javabin? I could only find one 
> on the "old wiki".
>
> Thanks,
> Gaël

Re: Sending compressed (gzip) UpdateRequest with SolrJ

2021-01-07 Thread matthew sporleder
jetty supports http gzip and I've added it to solr before in my own
installs (and submitted patches to do so by default to solr) but I
don't know about the handling for solrj.

IME compression helps a little, sometimes a lot, and never hurts.
Even the admin interface benefits a lot from regular old http gzip

On Thu, Jan 7, 2021 at 8:03 AM Gael Jourdan-Weil
 wrote:
>
> Answering to myself on this one.
>
> Solr uses Jetty 9.x, which does not support compressed requests by itself, 
> meaning the application behind Jetty (that is, Solr) has to decompress them 
> itself, which is not the case for now.
> Thus even without using SolrJ, sending XML compressed in GZIP to Solr (with 
> cURL for instance) is not possible for now.
>
> Seems quite surprising to me though.
>
> -
>
> Hello,
>
> I was wondering if someone ever had the need to send compressed (gzip) update 
> requests (adding/deleting documents), especially using SolrJ.
>
> Somehow I expected it to be done by default, but didn't find any 
> documentation about it and when looking at the code it seems there is no 
> option to do it. Or is javabin compressed by default?
> - 
> https://github.com/apache/lucene-solr/blob/master/solr/solrj/src/java/org/apache/solr/client/solrj/impl/BinaryRequestWriter.java#L49
> - 
> https://github.com/apache/lucene-solr/blob/master/solr/solrj/src/java/org/apache/solr/client/solrj/request/RequestWriter.java#L55
>  (if not using Javabin)
> - 
> https://github.com/apache/lucene-solr/blob/master/solr/solrj/src/java/org/apache/solr/client/solrj/impl/Http2SolrClient.java#L587
>
> By the way, is there any documentation about javabin? I could only find one 
> on the "old wiki".
>
> Thanks,
> Gaël


RE: Sending compressed (gzip) UpdateRequest with SolrJ

2021-01-07 Thread Gael Jourdan-Weil
Answering to myself on this one.

Solr uses Jetty 9.x, which does not support compressed requests by itself, 
meaning the application behind Jetty (that is, Solr) has to decompress them 
itself, which is not the case for now.
Thus even without using SolrJ, sending XML compressed in GZIP to Solr (with 
cURL for instance) is not possible for now.

Seems quite surprising to me though.

-
 
Hello,

I was wondering if someone ever had the need to send compressed (gzip) update 
requests (adding/deleting documents), especially using SolrJ.

Somehow I expected it to be done by default, but didn't find any documentation 
about it and when looking at the code it seems there is no option to do it. Or 
is javabin compressed by default?
- 
https://github.com/apache/lucene-solr/blob/master/solr/solrj/src/java/org/apache/solr/client/solrj/impl/BinaryRequestWriter.java#L49
- 
https://github.com/apache/lucene-solr/blob/master/solr/solrj/src/java/org/apache/solr/client/solrj/request/RequestWriter.java#L55
 (if not using Javabin)
- 
https://github.com/apache/lucene-solr/blob/master/solr/solrj/src/java/org/apache/solr/client/solrj/impl/Http2SolrClient.java#L587

By the way, is there any documentation about javabin? I could only find one on 
the "old wiki".

Thanks,
Gaël

Sending compressed (gzip) UpdateRequest with SolrJ

2021-01-05 Thread Gael Jourdan-Weil
Hello,

I was wondering if someone ever had the need to send compressed (gzip) update 
requests (adding/deleting documents), especially using SolrJ.

Somehow I expected it to be done by default, but didn't find any documentation 
about it and when looking at the code it seems there is no option to do it. Or 
is javabin compressed by default?
- 
https://github.com/apache/lucene-solr/blob/master/solr/solrj/src/java/org/apache/solr/client/solrj/impl/BinaryRequestWriter.java#L49
- 
https://github.com/apache/lucene-solr/blob/master/solr/solrj/src/java/org/apache/solr/client/solrj/request/RequestWriter.java#L55
 (if not using Javabin)
- 
https://github.com/apache/lucene-solr/blob/master/solr/solrj/src/java/org/apache/solr/client/solrj/impl/Http2SolrClient.java#L587

By the way, is there any documentation about javabin? I could only find one on 
the "old wiki".

Thanks,
Gaël

Re: Solrj supporting term vector component ?

2020-12-04 Thread Erick Erickson
To expand on Shawn’s comment: there are a lot of built-in helper methods in 
SolrJ, but
they all amount to setting a value in the underlying map of params, which you 
can
do yourself for any parameter you could specify on a URL or cURL command.

For instance, SolrQuery.setStart(start) is just:

this.set(CommonParams.START, start);

and this.set just puts CommonParams.START, start into the underlying parameter 
map.

I’m simplifying some here since the helper methods do some safety checking and
the like, but the take-away is “anything you can set on a URL or specify
in a cURL command can be specified in SolrJ by setting the parameter 
explicitly”.
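
Applied to this thread's term-vector question, a sketch might look like this (the
/tvrh handler path and the TermVectorParams constants are standard; the collection
and field names are made up):

import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.common.params.TermVectorParams;
import org.apache.solr.common.util.NamedList;

public class TermVectorSketch {
    public static void main(String[] args) throws Exception {
        try (SolrClient client = new HttpSolrClient.Builder(
                "http://localhost:8983/solr/myCollection").build()) {
            SolrQuery query = new SolrQuery("*:*");
            query.setRequestHandler("/tvrh");           // term vector request handler
            query.set(TermVectorParams.TV, true);       // &tv=true
            query.set(TermVectorParams.TF, true);       // &tv.tf=true
            query.set(TermVectorParams.FIELDS, "text"); // &tv.fl=text

            QueryResponse rsp = client.query(query);
            // No typed accessor exists; pull the section out of the NamedList.
            NamedList<?> termVectors = (NamedList<?>) rsp.getResponse().get("termVectors");
            System.out.println(termVectors);
        }
    }
}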

Best,
Erick

> On Dec 3, 2020, at 1:24 PM, Shawn Heisey  wrote:
> 
> On 12/3/2020 10:20 AM, Deepu wrote:
>> I am planning to use Term vector component for one of the use cases, as per
>> below solr documentation link solrj not supporting Term Vector Component,
>> do you have any other suggestions to use TVC in java application?
>> https://lucene.apache.org/solr/guide/8_4/the-term-vector-component.html#solrj-and-the-term-vector-component
> 
> SolrJ will support just about any query you might care to send, you just have 
> to give it all the required parameters when building the request. All the 
> results will be available, though you'll almost certainly have to provide 
> code yourself that rips apart the NamedList into usable info.
> 
> What is being said in the documentation is that there are not any special 
> objects or methods for doing term vector queries.  It's not saying that it 
> can't be done.
> 
> Thanks,
> Shawn



Re: Solrj supporting term vector component ?

2020-12-03 Thread Shawn Heisey

On 12/3/2020 10:20 AM, Deepu wrote:

I am planning to use Term vector component for one of the use cases, as per
below solr documentation link solrj not supporting Term Vector Component,
do you have any other suggestions to use TVC in java application?

https://lucene.apache.org/solr/guide/8_4/the-term-vector-component.html#solrj-and-the-term-vector-component


SolrJ will support just about any query you might care to send, you just 
have to give it all the required parameters when building the request. 
All the results will be available, though you'll almost certainly have 
to provide code yourself that rips apart the NamedList into usable info.


What is being said in the documentation is that there are not any 
special objects or methods for doing term vector queries.  It's not 
saying that it can't be done.


Thanks,
Shawn


Solrj supporting term vector component ?

2020-12-03 Thread Deepu
Dear Community Members,

I am planning to use Term vector component for one of the use cases, as per
below solr documentation link solrj not supporting Term Vector Component,
do you have any other suggestions to use TVC in java application?

https://lucene.apache.org/solr/guide/8_4/the-term-vector-component.html#solrj-and-the-term-vector-component


Thanks,
Deepu


Re: SolrJ NestableJsonFacet ordering of query facet

2020-11-19 Thread Jason Gerlowski
Hi Shivram,

I think the short answer is "no".  At least, not without sub-classing
some of the JSON-Facet classes in SolrJ.

But it's hard for me to understand your particular concern without
seeing a concrete example.  If you provide an example (maybe in the
form of a JUnit test snippet showing the actual and expected values),
I may be able to provide more help.

Best,

Jason

On Fri, Oct 30, 2020 at 1:54 AM Shivam Jha  wrote:
>
> Hi folks,
>
> Does anyone have any advice on this issue?
>
> Thanks,
> Shivam
>
> On Tue, Oct 27, 2020 at 1:20 PM Shivam Jha  wrote:
>
> > Hi folks,
> >
> > Doing some faceted queries using 'facet.json' param and SolrJ, the results
> > of which I am processing using SolrJ NestableJsonFacet class.
> > basically as   *queryResponse.getJsonFacetingResponse() -> returns 
> > *NestableJsonFacet
> > object.
> >
> > But I have noticed it does not maintain the facet-query order in which it
> > was given in *facet.json.*
> > *Direct queries to solr do maintain that order, but not after it comes to
> > Java layer in SolrJ.*
> >
> > Is there a way to make it maintain that order ?
> > Hopefully the question makes sense, if not please let me know I can
> > clarify further.
> >
> > Thanks,
> > Shivam
> >
>
>
> --
> shivamJha


Re: getFirstValue problem with solrj

2020-11-05 Thread Raivo Rebane

Hello,

I found the mistake.

Thanks

Raivo
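
The likely mistake, inferred from the quoted code below: the query's field list
only requests "id" and "App_code", so "Name" never comes back in the documents.
A minimal fix sketch, mirroring that snippet (and assuming "Name" is stored):

    SolrQuery solrQuery = new SolrQuery("Name:" + Pattern);
    solrQuery.addField("id");
    solrQuery.addField("Name");     // without this, getFirstValue("Name") returns null
    solrQuery.addField("App_code");
    solrQuery.setSort("App_code", ORDER.asc);
    solrQuery.setRows(10);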

On 05.11.20 17:08, Raivo Rebane wrote:

Hello

I have one problem

I made a query as follows:

    System.out.println("Querying by using SolrQuery...");
    SolrQuery solrQuery = new SolrQuery("Name:" + Pattern);
    solrQuery.addField("id");
    solrQuery.addField("App_code");
    solrQuery.setSort("App_code", ORDER.asc);
    solrQuery.setRows(10);

    QueryResponse response = null;
    try {
        response = AppServServlet.solrClientApps.query(solrQuery);
    } catch (SolrServerException | IOException e) {
        System.err.printf("\nFailed to search articles: %s", e.getMessage());
    }

and got the response as follows:

    SolrDocumentList documents = null;
    if (response != null) {
        documents = response.getResults();
        System.out.printf("Found %d documents\n", documents.getNumFound());

        for (SolrDocument document : documents) {
            String id = (String) document.getFirstValue("id");
            String name = (String) document.getFirstValue("Name");
            long App_code1 = (long) document.getFirstValue("App_code");
            System.out.printf("id=%s, App_code=%s, name=%s\n", id, App_code1, name);
        }
    }

Found 1 document but name = (String) document.getFirstValue("Name"); 
returns null


Console looks like :

Querying by using SolrQuery...
Found 1 documents
id=92, App_code=2, name=null

What is wrong?

Please, somebody help me!

Regards

Raivo

PS. Collection is like :

57 0  HIS
45 1  RAN/AEGIR
92 2  PIT2
*



getFirstValue problem with solrj

2020-11-05 Thread Raivo Rebane

Hello

I have one problem

I made a query as follows:

    System.out.println("Querying by using SolrQuery...");
    SolrQuery solrQuery = new SolrQuery("Name:" + Pattern);
    solrQuery.addField("id");
    solrQuery.addField("App_code");
    solrQuery.setSort("App_code", ORDER.asc);
    solrQuery.setRows(10);

    QueryResponse response = null;
    try {
        response = AppServServlet.solrClientApps.query(solrQuery);
    } catch (SolrServerException | IOException e) {
        System.err.printf("\nFailed to search articles: %s", e.getMessage());
    }

and got the response as follows:

    SolrDocumentList documents = null;
    if (response != null) {
        documents = response.getResults();
        System.out.printf("Found %d documents\n", documents.getNumFound());

        for (SolrDocument document : documents) {
            String id = (String) document.getFirstValue("id");
            String name = (String) document.getFirstValue("Name");
            long App_code1 = (long) document.getFirstValue("App_code");
            System.out.printf("id=%s, App_code=%s, name=%s\n", id, App_code1, name);
        }
    }

Found 1 document but name = (String) document.getFirstValue("Name"); 
returns null


Console looks like :

Querying by using SolrQuery...
Found 1 documents
id=92, App_code=2, name=null

What is wrong?

Please, somebody help me!

Regards

Raivo

PS. Collection is like :

57 0  HIS
45 1  RAN/AEGIR
92 2  PIT2
*



Re: SolrJ NestableJsonFacet ordering of query facet

2020-10-29 Thread Shivam Jha
Hi folks,

Does anyone have any advice on this issue?

Thanks,
Shivam

On Tue, Oct 27, 2020 at 1:20 PM Shivam Jha  wrote:

> Hi folks,
>
> Doing some faceted queries using 'facet.json' param and SolrJ, the results
> of which I am processing using SolrJ NestableJsonFacet class.
> basically as   *queryResponse.getJsonFacetingResponse() -> returns 
> *NestableJsonFacet
> object.
>
> But I have noticed it does not maintain the facet-query order in which it
> was given in *facet.json.*
> *Direct queries to solr do maintain that order, but not after it comes to
> Java layer in SolrJ.*
>
> Is there a way to make it maintain that order ?
> Hopefully the question makes sense, if not please let me know I can
> clarify further.
>
> Thanks,
> Shivam
>


-- 
shivamJha


SolrJ NestableJsonFacet ordering of query facet

2020-10-27 Thread Shivam Jha
Hi folks,

Doing some faceted queries using 'facet.json' param and SolrJ, the results
of which I am processing using SolrJ NestableJsonFacet class.
basically as   *queryResponse.getJsonFacetingResponse() -> returns
*NestableJsonFacet
object.

But I have noticed it does not maintain the facet-query order in which it
was given in *facet.json.*
*Direct queries to solr do maintain that order, but not after it comes to
Java layer in SolrJ.*

Is there a way to make it maintain that order ?
Hopefully the question makes sense, if not please let me know I can clarify
further.

Thanks,
Shivam


I cannot execute my SolrJ application

2020-10-21 Thread Raivo Rebane

Hello

I can compile my project properly.

But problems arise when I execute it with Maven, and I don't know whether
these are solrj problems or Maven problems:

Apache Maven 3.6.3 (cecedd343002696d0abb50b32b541b8a6ba2883f)
Maven home: /home/hydra/workspace1/test/EMBEDDED
Java version: 11.0.8, vendor: Ubuntu, runtime: 
/usr/lib/jvm/java-11-openjdk-amd64

Default locale: en_US, platform encoding: UTF-8
OS name: "linux", version: "5.4.0-51-generic", arch: "amd64", family: "unix"

[DEBUG] Adding to classpath : 
file:/home/hydra/workspace1/test/target/classes/

[DEBUG] Adding project dependency artifact: solr-solrj to classpath
[DEBUG] Adding project dependency artifact: slf4j-api to classpath
[DEBUG] Adding project dependency artifact: commons-httpclient to classpath
[DEBUG] Adding project dependency artifact: commons-logging to classpath
[DEBUG] Adding project dependency artifact: commons-codec to classpath
[DEBUG] Adding project dependency artifact: commons-io to classpath
[DEBUG] Adding project dependency artifact: commons-fileupload to classpath
[DEBUG] Adding project dependency artifact: wstx-asl to classpath
[DEBUG] Adding project dependency artifact: stax-api to classpath
[DEBUG] Adding project dependency artifact: geronimo-stax-api_1.0_spec 
to classpath

[DEBUG] joining on thread Thread[SolrJExample.main(),5,SolrJExample]
[DEBUG] Setting accessibility to true in order to invoke main().
[INFO] 


[INFO] BUILD FAILURE
[INFO] 


[INFO] Total time:  53.717 s
[INFO] Finished at: 2020-10-21T09:41:08+03:00
[INFO] 

[ERROR] Failed to execute goal 
org.codehaus.mojo:exec-maven-plugin:1.1:java (default-cli) on project 
test: An exception occured while executing the Java class. null: 
InvocationTargetException: Unresolved compilation problem: -> [Help 1]
org.apache.maven.lifecycle.LifecycleExecutionException: Failed to 
execute goal org.codehaus.mojo:exec-maven-plugin:1.1:java (default-cli) 
on project test: An exception occured while executing the Java class. null
    at 
org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:215)
    at 
org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:156)
    at 
org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:148)
    at 
org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:117)
    at 
org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:81)
    at 
org.apache.maven.lifecycle.internal.builder.singlethreaded.SingleThreadedBuilder.build(SingleThreadedBuilder.java:56)
    at 
org.apache.maven.lifecycle.internal.LifecycleStarter.execute(LifecycleStarter.java:128)

    at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:305)
    at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:192)
    at org.apache.maven.DefaultMaven.execute(DefaultMaven.java:105)
    at org.apache.maven.cli.MavenCli.execute(MavenCli.java:957)
    at org.apache.maven.cli.MavenCli.doMain(MavenCli.java:289)
    at org.apache.maven.cli.MavenCli.main(MavenCli.java:193)
    at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native 
Method)
    at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at 
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)

    at java.base/java.lang.reflect.Method.invoke(Method.java:566)
    at 
org.codehaus.plexus.classworlds.launcher.Launcher.launchEnhanced(Launcher.java:282)
    at 
org.codehaus.plexus.classworlds.launcher.Launcher.launch(Launcher.java:225)
    at 
org.codehaus.plexus.classworlds.launcher.Launcher.mainWithExitCode(Launcher.java:406)
    at 
org.codehaus.plexus.classworlds.launcher.Launcher.main(Launcher.java:347)
Caused by: org.apache.maven.plugin.MojoExecutionException: An exception 
occured while executing the Java class. null

    at org.codehaus.mojo.exec.ExecJavaMojo.execute(ExecJavaMojo.java:345)
    at 
org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo(DefaultBuildPluginManager.java:137)
    at 
org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:210)

    ... 20 more
Caused by: java.lang.reflect.InvocationTargetException
    at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native 
Method)
    at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at 
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)

    at java.base/java.lang.reflect.Method.invoke(Method.java:566)
    at org.codehaus.mojo.exec.ExecJavaMojo$1.run(ExecJavaMojo.java

SolrJ doesn't find symbol

2020-10-20 Thread Raivo Rebane

Hello

I want to use solrj in a Maven project, but I got errors:

[ERROR] 
/home/hydra/workspace1/kaks/src/main/java/solr/SolrJExample.java:[9,36] 
cannot find symbol

[ERROR]   symbol:   class SolrClient
[ERROR]   location: package org.apache.solr.client.solrj
[ERROR] 
/home/hydra/workspace1/kaks/src/main/java/solr/SolrJExample.java:[14,41] 
cannot find symbol

[ERROR]   symbol:   class HttpSolrClient
[ERROR]   location: package org.apache.solr.client.solrj.impl
[ERROR] 
/home/hydra/workspace1/kaks/src/main/java/solr/SolrJExample.java:[32,26] 
cannot find symbol

[ERROR]   symbol:   class SolrClient
[ERROR]   location: class solr.SolrJExample
[ERROR] 
/home/hydra/workspace1/kaks/src/main/java/solr/SolrJExample.java:[32,68] 
package HttpSolrClient does not exist
[ERROR] 
/home/hydra/workspace1/kaks/src/main/java/solr/SolrJExample.java:[133,18] 
cannot find symbol
[ERROR]   symbol:   method 
setSort(java.lang.String,org.apache.solr.client.solrj.SolrQuery.ORDER)
[ERROR]   location: variable solrQuery of type 
org.apache.solr.client.solrj.SolrQuery
[ERROR] 
/home/hydra/workspace1/kaks/src/main/java/solr/SolrJExample.java:[161,18] 
cannot find symbol
[ERROR]   symbol:   method 
setSort(java.lang.String,org.apache.solr.client.solrj.SolrQuery.ORDER)
[ERROR]   location: variable solrQuery of type 
org.apache.solr.client.solrj.SolrQuery


My .classpath looks like:



    path="TOMCAT_HOME/solr/httpclient5-5.0.3.jar"/>
    path="TOMCAT_HOME/solr/solr-solrj-8.6.3.jar"/>
    path="TOMCAT_HOME/solr/commons-codec-1.15.jar"/>
    path="TOMCAT_HOME/solr/commons-httpclient-3.1.jar"/>

    
    path="TOMCAT_HOME/solr/jcl-over-slf4j-1.5.5.jar"/>
    path="src/main/java">

        
            
            
        
    
    path="src/main/resources">

        
            
        
    
    path="src/test/java">

        
            
            
            
        
    
    output="target/test-classes" path="src/test/resources">

        
            
            
        
    
    path="org.eclipse.jdt.launching.JRE_CONTAINER/org.eclipse.jdt.internal.debug.ui.launcher.StandardVMType/J2SE-1.5">

        
            
        
    
    path="org.eclipse.m2e.MAVEN2_CLASSPATH_CONTAINER">

        
            
        
    
    


Hope that somebody can help me.

I can't understand where the mistake is.

Looking forward

Raivo Rebane

I added my pom.xml:


<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">

    <modelVersion>4.0.0</modelVersion>

    <groupId>com.illucit</groupId>
    <artifactId>illucit-solr</artifactId>
    <version>0.0.1-SNAPSHOT</version>
    <packaging>war</packaging>

    <properties>
        <maven.compiler.source>1.8</maven.compiler.source>
        <maven.compiler.target>1.8</maven.compiler.target>
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
        <version.compiler.plugin>3.2</version.compiler.plugin>
        <version.source.plugin>2.4</version.source.plugin>
        <version.javadoc.plugin>2.10.1</version.javadoc.plugin>
        <version.war.plugin>2.6</version.war.plugin>
    </properties>

    <dependencies>
        <dependency>
            <artifactId>solr-solrj</artifactId>
            <groupId>org.apache.solr</groupId>
            <version>1.4.0</version>
            <type>jar</type>
            <scope>compile</scope>
        </dependency>
    </dependencies>

    <build>
        <finalName>${project.artifactId}</finalName>
        <sourceDirectory>src</sourceDirectory>
        <testSourceDirectory>test</testSourceDirectory>
        <resources>
            <resource>
                <directory>resources</directory>
            </resource>
        </resources>
        <testResources>
            <testResource>
                <directory>testresources</directory>
            </testResource>
        </testResources>
        <pluginManagement>
            <plugins>
                <plugin>
                    <artifactId>maven-war-plugin</artifactId>
                    <configuration>
                        <warName>illucit-solr</warName>
                    </configuration>
                </plugin>
                <plugin>
                    <artifactId>maven-source-plugin</artifactId>
                </plugin>
                <plugin>
                    <artifactId>maven-javadoc-plugin</artifactId>
                </plugin>
            </plugins>
        </pluginManagement>
        <plugins>
            <plugin>
                <artifactId>maven-compiler-plugin</artifactId>
                <version>${version.compiler.plugin}</version>
                <configuration>
                    <source>${maven.compiler.source}</source>
                    <target>${maven.compiler.target}</target>
                </configuration>
            </plugin>
            <plugin>
                <artifactId>maven-source-plugin</artifactId>
                <version>${version.source.plugin}</version>
                <executions>
                    <execution>
                        <id>attach-sources</id>
                        <goals>
                            <goal>jar-no-fork</goal>
                        </goals>
                    </execution>
                </executions>
            </plugin>
            <plugin>
                <artifactId>maven-javadoc-plugin</artifactId>
                <version>${version.javadoc.plugin}</version>
                <executions>
                    <execution>
                        <id>attach-javadocs</id>
                        <goals>
                            <goal>jar</goal>
                        </goals>
                    </execution>
                </executions>
            </plugin>
            <plugin>
                <artifactId>maven-war-plugin</artifactId>
                <version>${version.war.plugin}</version>
                <configuration>
                    <failOnMissingWebXml>false</failOnMissingWebXml>
                    <attachClasses>true</attachClasses>
                    <warSourceDirectory>WebContent</warSourceDirectory>
                </configuration>
            </plugin>
        </plugins>
    </build>

</project>


Re: NPE Issue with atomic update to nested document or child document through SolrJ

2020-09-18 Thread Frank Vos
Good morning all,

Your mail came into my mailbox; please check with your IT support why your mail 
is coming into my mailbox!
If I notice that this is not being done, I will have to take far-reaching 
measures.

Met vriendelijke groet / Kind regards,

Frank Vos
Service Desk Employee

T 036 760 07 11
M 06 83 33 57 95
E frank@reddata.nl



From: Alexandre Rafalovitch
Date: Thursday, 17 September 2020 at 21:07
To: solr-user
Subject: Re: NPE Issue with atomic update to nested document or child 
document through SolrJ
The missing underscore is a documentation bug, because it was not
escaped the second time and the asciidoc chewed it up as a
bold/italic indicator. The declaration and references should match.

I am not sure about the code. I hope somebody else will step in on that part.

Regards,
   Alex.

On Thu, 17 Sep 2020 at 14:48, Pratik Patel  wrote:
>
> I am running this in a unit test which deletes the collection after the
> test is over. So every new test run gets a fresh collection.
>
> It is a very simple test where I am first indexing a couple of parent
> documents with few children and then testing an atomic update on one parent
> as I have posted in my previous message. (using UpdateRequest)
>
> I am not sure if I am triggering the atomic update correctly, do you see
> any potential issue in that code?
>
> I noticed something in the documentation here.
> https://lucene.apache.org/solr/guide/8_5/indexing-nested-documents.html#indexing-nested-documents
>
> <field name="_nest_path_" type="*nest_path*" />
>
> field_type is declared with name *"_nest_path_"* whereas field is declared
> with type *"nest_path". *
>
> Is this intentional? or should it be as follows?
>
> <field name="_nest_path_" type="* _nest_path_ *" />
>
> Also, should we explicitly set index=true and store=true on _nest_path_
> and _nest_parent_ fields?
>
>
>
> On Thu, Sep 17, 2020 at 1:17 PM Alexandre Rafalovitch 
> wrote:
>
> > Did you reindex the original document after you added a new field? If
> > not, then the previously indexed content is missing it and your code
> > paths will get out of sync.
> >
> > Regards,
> >Alex.
> > P.s. I haven't done what you are doing before, so there may be
> > something I am missing myself.
> >
> >
> > On Thu, 17 Sep 2020 at 12:46, Pratik Patel  wrote:
> > >
> > > Thanks for your reply Alexandre.
> > >
> > > I have "_root_" and "_nest_path_" fields in my schema but not
> > > "_nest_parent_".
> > >
> > >
> > > <field name="_root_" type="string" indexed="true" stored="false" docValues="false" />
> > > <fieldType name="_nest_path_" class="solr.NestPathField" />
> > >
> > > I ran my test after adding the "_nest_parent_" field and I am not getting
> > > NPE any more which is good. Thanks!
> > >
> > > But looking at the documents in the index, I see that after the atomic
> > > update, now there are two children documents with the same id. One
> > document
> > > has old values and another one has new values. Shouldn't they be merged
> > > based on the "id"? Do we need to specify anything else in the request to
> > > ensure that documents are merged/updated and not duplicated?
> > >
> > > For your reference, below is the test I am running now.
> > >
> > > // update field of one child doc
> > > SolrInputDocument sdoc = new SolrInputDocument(  );
> > > sdoc.addField( "id", testChildPOJO.id() );
> > > sdoc.addField( "conceptid", testChildPOJO.conceptid() );
> > > sdoc.addField( "storeid", "foo" );
> > > sdoc.setField( "fieldName",
> > > java.util.Collections.singletonMap("set", Collections.list("bar" ) ));
> > >
> > > final UpdateRequest req = new UpdateRequest();
> > > req.withRoute( pojo1.id() );// parent id
> > > req.add(sdoc);
> > >
> > > collection.client.request( req,
> > collection.getCollectionName()
> > > );
> > > collection.client.commit();
> > >
> > >
> > > Resulting documents :
> > >
> > > {id=c1_child1, conceptid=c1, storeid=s1,
> > fieldName=c1_child1_field_value1,
> > > startTime=Mon Sep 07 12:40:37 EDT 2020, integerField_iDF=10,
> > > booleanField

Re: NPE Issue with atomic update to nested document or child document through SolrJ

2020-09-18 Thread Frank Vos


Met vriendelijke groet / Kind regards,

Frank Vos
Service Desk Employee

T 036 760 07 11
M 06 83 33 57 95
E frank@reddata.nl



From: Alexandre Rafalovitch
Date: Thursday, 17 September 2020 at 21:07
To: solr-user
Subject: Re: NPE Issue with atomic update to nested document or child 
document through SolrJ
The missing underscore is a documentation bug, because it was not
escaped the second time and the asciidoc chewed it up as a
bold/italic indicator. The declaration and references should match.

I am not sure about the code. I hope somebody else will step in on that part.

Regards,
   Alex.

On Thu, 17 Sep 2020 at 14:48, Pratik Patel  wrote:
>
> I am running this in a unit test which deletes the collection after the
> test is over. So every new test run gets a fresh collection.
>
> It is a very simple test where I am first indexing a couple of parent
> documents with few children and then testing an atomic update on one parent
> as I have posted in my previous message. (using UpdateRequest)
>
> I am not sure if I am triggering the atomic update correctly, do you see
> any potential issue in that code?
>
> I noticed something in the documentation here.
> https://lucene.apache.org/solr/guide/8_5/indexing-nested-documents.html#indexing-nested-documents
>
> <field name="_nest_path_" type="*nest_path*" />
>
> field_type is declared with name *"_nest_path_"* whereas field is declared
> with type *"nest_path". *
>
> Is this intentional? or should it be as follows?
>
> <field name="_nest_path_" type="* _nest_path_ *" />
>
> Also, should we explicitly set index=true and store=true on _nest_path_
> and _nest_parent_ fields?
>
>
>
> On Thu, Sep 17, 2020 at 1:17 PM Alexandre Rafalovitch 
> wrote:
>
> > Did you reindex the original document after you added a new field? If
> > not, then the previously indexed content is missing it and your code
> > paths will get out of sync.
> >
> > Regards,
> >Alex.
> > P.s. I haven't done what you are doing before, so there may be
> > something I am missing myself.
> >
> >
> > On Thu, 17 Sep 2020 at 12:46, Pratik Patel  wrote:
> > >
> > > Thanks for your reply Alexandre.
> > >
> > > I have "_root_" and "_nest_path_" fields in my schema but not
> > > "_nest_parent_".
> > >
> > >
> > > <field name="_root_" type="string" indexed="true" stored="false" docValues="false" />
> > > <fieldType name="_nest_path_" class="solr.NestPathField" />
> > >
> > > I ran my test after adding the "_nest_parent_" field and I am not getting
> > > NPE any more which is good. Thanks!
> > >
> > > But looking at the documents in the index, I see that after the atomic
> > > update, now there are two children documents with the same id. One
> > document
> > > has old values and another one has new values. Shouldn't they be merged
> > > based on the "id"? Do we need to specify anything else in the request to
> > > ensure that documents are merged/updated and not duplicated?
> > >
> > > For your reference, below is the test I am running now.
> > >
> > > // update field of one child doc
> > > SolrInputDocument sdoc = new SolrInputDocument(  );
> > > sdoc.addField( "id", testChildPOJO.id() );
> > > sdoc.addField( "conceptid", testChildPOJO.conceptid() );
> > > sdoc.addField( "storeid", "foo" );
> > > sdoc.setField( "fieldName",
> > > java.util.Collections.singletonMap("set", Collections.list("bar" ) ));
> > >
> > > final UpdateRequest req = new UpdateRequest();
> > > req.withRoute( pojo1.id() );// parent id
> > > req.add(sdoc);
> > >
> > > collection.client.request( req,
> > collection.getCollectionName()
> > > );
> > > collection.client.commit();
> > >
> > >
> > > Resulting documents :
> > >
> > > {id=c1_child1, conceptid=c1, storeid=s1,
> > fieldName=c1_child1_field_value1,
> > > startTime=Mon Sep 07 12:40:37 EDT 2020, integerField_iDF=10,
> > > booleanField_bDF=true, _root_=abcd, _version_=1678099970090074112}
> > > {id=c1_child1, conceptid=c1, storeid=foo, fieldName=bar, startTime=Mon
> > Sep
> > > 07 12:40:37 EDT 2020, integerField

Re: NPE Issue with atomic update to nested document or child document through SolrJ

2020-09-17 Thread Alexandre Rafalovitch
The missing underscore is a documentation bug, because it was not
escaped the second time and the asciidoc chewed it up as a
bold/italic indicator. The declaration and references should match.

I am not sure about the code. I hope somebody else will step in on that part.

Regards,
   Alex.

On Thu, 17 Sep 2020 at 14:48, Pratik Patel  wrote:
>
> I am running this in a unit test which deletes the collection after the
> test is over. So every new test run gets a fresh collection.
>
> It is a very simple test where I am first indexing a couple of parent
> documents with few children and then testing an atomic update on one parent
> as I have posted in my previous message. (using UpdateRequest)
>
> I am not sure if I am triggering the atomic update correctly, do you see
> any potential issue in that code?
>
> I noticed something in the documentation here.
> https://lucene.apache.org/solr/guide/8_5/indexing-nested-documents.html#indexing-nested-documents
>
> <field name="_nest_path_" type="*nest_path*" />
>
> field_type is declared with name *"_nest_path_"* whereas field is declared
> with type *"nest_path". *
>
> Is this intentional? or should it be as follows?
>
> <field name="_nest_path_" type="* _nest_path_ *" />
>
> Also, should we explicitly set index=true and store=true on _nest_path_
> and _nest_parent_ fields?
>
>
>
> On Thu, Sep 17, 2020 at 1:17 PM Alexandre Rafalovitch 
> wrote:
>
> > Did you reindex the original document after you added a new field? If
> > not, then the previously indexed content is missing it and your code
> > paths will get out of sync.
> >
> > Regards,
> >Alex.
> > P.s. I haven't done what you are doing before, so there may be
> > something I am missing myself.
> >
> >
> > On Thu, 17 Sep 2020 at 12:46, Pratik Patel  wrote:
> > >
> > > Thanks for your reply Alexandre.
> > >
> > > I have "_root_" and "_nest_path_" fields in my schema but not
> > > "_nest_parent_".
> > >
> > >
> > > 
> > > <field name="_root_" type="string" indexed="true" stored="false" docValues="false" />
> > > <fieldType name="_nest_path_" class="solr.NestPathField" />
> > >
> > > I ran my test after adding the "_nest_parent_" field and I am not getting
> > > NPE any more which is good. Thanks!
> > >
> > > But looking at the documents in the index, I see that after the atomic
> > > update, now there are two children documents with the same id. One
> > document
> > > has old values and another one has new values. Shouldn't they be merged
> > > based on the "id"? Do we need to specify anything else in the request to
> > > ensure that documents are merged/updated and not duplicated?
> > >
> > > For your reference, below is the test I am running now.
> > >
> > > // update field of one child doc
> > > SolrInputDocument sdoc = new SolrInputDocument(  );
> > > sdoc.addField( "id", testChildPOJO.id() );
> > > sdoc.addField( "conceptid", testChildPOJO.conceptid() );
> > > sdoc.addField( "storeid", "foo" );
> > > sdoc.setField( "fieldName",
> > > java.util.Collections.singletonMap("set", Collections.list("bar" ) ));
> > >
> > > final UpdateRequest req = new UpdateRequest();
> > > req.withRoute( pojo1.id() );// parent id
> > > req.add(sdoc);
> > >
> > > collection.client.request( req,
> > collection.getCollectionName()
> > > );
> > > collection.client.commit();
> > >
> > >
> > > Resulting documents :
> > >
> > > {id=c1_child1, conceptid=c1, storeid=s1,
> > fieldName=c1_child1_field_value1,
> > > startTime=Mon Sep 07 12:40:37 EDT 2020, integerField_iDF=10,
> > > booleanField_bDF=true, _root_=abcd, _version_=1678099970090074112}
> > > {id=c1_child1, conceptid=c1, storeid=foo, fieldName=bar, startTime=Mon
> > Sep
> > > 07 12:40:37 EDT 2020, integerField_iDF=10, booleanField_bDF=true,
> > > _root_=abcd, _version_=1678099970405695488}
> > >
> > >
> > >
> > >
> > >
> > >
> > > On Thu, Sep 17, 2020 at 12:01 PM Alexandre Rafalovitch <
> > arafa...@gmail.com>
> > > wrote:
> > >
> > > > Can you double

Re: NPE Issue with atomic update to nested document or child document through SolrJ

2020-09-17 Thread Pratik Patel
I am running this in a unit test which deletes the collection after the
test is over. So every new test run gets a fresh collection.

It is a very simple test where I am first indexing a couple of parent
documents with a few children and then testing an atomic update on one parent
as I have posted in my previous message. (using UpdateRequest)

I am not sure if I am triggering the atomic update correctly, do you see
any potential issue in that code?

I noticed something in the documentation here.
https://lucene.apache.org/solr/guide/8_5/indexing-nested-documents.html#indexing-nested-documents

<field name="_nest_path_" type="*nest_path*" />

field_type is declared with name *"_nest_path_"* whereas field is declared
with type *"nest_path"*.

Is this intentional? or should it be as follows?

<field name="_nest_path_" type="*_nest_path_*" />

Also, should we explicitly set indexed=true and stored=true on _nest_path_
and _nest_parent_ fields?



On Thu, Sep 17, 2020 at 1:17 PM Alexandre Rafalovitch 
wrote:

> Did you reindex the original document after you added a new field? If
> not, then the previously indexed content is missing it and your code
> paths will get out of sync.
>
> Regards,
>Alex.
> P.s. I haven't done what you are doing before, so there may be
> something I am missing myself.
>
>
> On Thu, 17 Sep 2020 at 12:46, Pratik Patel  wrote:
> >
> > Thanks for your reply Alexandre.
> >
> > I have "_root_" and "_nest_path_" fields in my schema but not
> > "_nest_parent_".
> >
> >
> > 
> > 
> > <field name="_root_" ... docValues="false" />
> > <fieldType name="_nest_path_" class="solr.NestPathField" />
> >
> > I ran my test after adding the "_nest_parent_" field and I am not getting
> > NPE any more which is good. Thanks!
> >
> > But looking at the documents in the index, I see that after the atomic
> > update, now there are two children documents with the same id. One
> document
> > has old values and another one has new values. Shouldn't they be merged
> > based on the "id"? Do we need to specify anything else in the request to
> > ensure that documents are merged/updated and not duplicated?
> >
> > For your reference, below is the test I am running now.
> >
> > // update field of one child doc
> > SolrInputDocument sdoc = new SolrInputDocument(  );
> > sdoc.addField( "id", testChildPOJO.id() );
> > sdoc.addField( "conceptid", testChildPOJO.conceptid() );
> > sdoc.addField( "storeid", "foo" );
> > sdoc.setField( "fieldName",
> > java.util.Collections.singletonMap("set", Collections.list("bar" ) ));
> >
> > final UpdateRequest req = new UpdateRequest();
> > req.withRoute( pojo1.id() );// parent id
> > req.add(sdoc);
> >
> > collection.client.request( req,
> collection.getCollectionName()
> > );
> > collection.client.commit();
> >
> >
> > Resulting documents :
> >
> > {id=c1_child1, conceptid=c1, storeid=s1,
> fieldName=c1_child1_field_value1,
> > startTime=Mon Sep 07 12:40:37 EDT 2020, integerField_iDF=10,
> > booleanField_bDF=true, _root_=abcd, _version_=1678099970090074112}
> > {id=c1_child1, conceptid=c1, storeid=foo, fieldName=bar, startTime=Mon
> Sep
> > 07 12:40:37 EDT 2020, integerField_iDF=10, booleanField_bDF=true,
> > _root_=abcd, _version_=1678099970405695488}
> >
> >
> >
> >
> >
> >
> > On Thu, Sep 17, 2020 at 12:01 PM Alexandre Rafalovitch <
> arafa...@gmail.com>
> > wrote:
> >
> > > Can you double-check your schema to see if you have all the fields
> > > required to support nested documents. You are supposed to get away
> > > with just _root_, but really you should also include _nest_path and
> > > _nest_parent_. Your particular exception seems to be triggering
> > > something (maybe a bug) related to - possibly - missing _nest_path_
> > > field.
> > >
> > > See:
> > >
> https://lucene.apache.org/solr/guide/8_5/indexing-nested-documents.html#indexing-nested-documents
> > >
> > > Regards,
> > >Alex.
> > >
> > > On Wed, 16 Sep 2020 at 13:28, Pratik Patel 
> wrote:
> > > >
> > > > Hello Everyone,
> > > >
> > > > I am trying to update a field of a child document using atomic
> updates
> > > > feature. I am using solr and solrJ version 8.5.0
> > > >
> > >

Re: Doing what <copyField> does using SolrJ API

2020-09-17 Thread Steven White
Thank you all for your feedback.  It is very helpful.

@Walter, out of the 1000 fields in Solr's schema, only 5 are set as
"required" fields and the Solr doc that I create and then send to Solr for
indexing contains only those fields that have data to be indexed.  So some
docs will have 10 fields, some 50, etc.

Steven

On Thu, Sep 17, 2020 at 1:55 PM Erick Erickson 
wrote:

> The script can actually be written in any number of scripting languages,
> python, groovy,
> javascript etc. but Alexandre’s comments about javascript are well taken.
>
> It all depends here on whether you ever want to search the fields
> individually. If you do,
> you need to have them in your index as well as the copyField.
>
> > On Sep 17, 2020, at 1:37 PM, Walter Underwood 
> wrote:
> >
> > If you want to ignore a field being sent to Solr, you can set
> indexed=false and
> > stored=false for that field in schema.xml. It will take up room in
> schema.xml but
> > zero room on disk.
> >
> > wunder
> > Walter Underwood
> > wun...@wunderwood.org
> > http://observer.wunderwood.org/  (my blog)
> >
> >> On Sep 17, 2020, at 10:23 AM, Alexandre Rafalovitch 
> wrote:
> >>
> >> Solr has a whole pipeline that you can run during document ingesting
> before
> >> the actual indexing happens. It is called Update Request Processor (URP)
> >> and is defined in solrconfig.xml or in an override file. Obviously,
> since
> >> you are indexing from SolrJ client, you have even more flexibility, but
> it
> >> is good to know about anyway.
> >>
> >> You can read all about it at:
> >> https://lucene.apache.org/solr/guide/8_6/update-request-processors.html
> and
> >> see the extensive list of processors you can leverage. The specific
> >> mentioned one is this one:
> >>
> https://lucene.apache.org/solr/8_6_0//solr-core/org/apache/solr/update/processor/StatelessScriptUpdateProcessorFactory.html
> >>
> >> Just a word of warning that Stateless URP is using Javascript, which is
> >> getting a bit of a complicated story as underlying JVM is upgraded
> (Oracle
> >> dropped their javascript engine in JDK 14). So if one of the simpler
> URPs
> >> will do the job or a chain of them, that may be a better path to take.
> >>
> >> Regards,
> >>  Alex.
> >>
> >>
> >> On Thu, 17 Sep 2020 at 13:13, Steven White 
> wrote:
> >>
> >>> Thanks Erick.  Where can I learn more about "stateless script update
> >>> processor factory".  I don't know what you mean by this.
> >>>
> >>> Steven
> >>>
> >>> On Thu, Sep 17, 2020 at 1:08 PM Erick Erickson <
> erickerick...@gmail.com>
> >>> wrote:
> >>>
> >>>> 1000 fields is fine, you'll waste some cycles on bookkeeping, but I
> >>> really
> >>>> doubt you'll notice. That said, are these fields used for searching?
> >>>> Because you do have control over what goes into the index if you can
> put
> >>> a
> >>>> "stateless script update processor factory" in your update chain.
> There
> >>> you
> >>>> can do whatever you want, including combine all the fields into one
> and
> >>>> delete the original fields. There's no point in having your index
> >>> cluttered
> >>>> with unused fields, OTOH, it may not be worth the effort just to
> satisfy
> >>> my
> >>>> sense of aesthetics 
> >>>>
> >>>> On Thu, Sep 17, 2020, 12:59 Steven White 
> wrote:
> >>>>
> >>>>> Hi Eric,
> >>>>>
> >>>>> Yes, this is coming from a DB.  Unfortunately I have no control over
> >>> the
> >>>>> list of fields.  Out of the 1000 fields that there maybe, no
> document,
> >>>> that
> >>>>> gets indexed into Solr will use more then about 50 and since i'm
> >>> copying
> >>>>> the values of those fields to the catch-all field and the catch-all
> >>> field
> >>>>> is my default search field, I don't expect any problem for having
> 1000
> >>>>> fields in Solr's schema, or should I?
> >>>>>
> >>>>> Thanks
> >>>>>
> >>>>> Steven
> >>>>>
> >>>>>
> >>>>> On Thu, Sep 17, 2020 at 8:23 AM Erick Erickson <
> >>> erickerick

Re: Handling failure when adding docs to Solr using SolrJ

2020-09-17 Thread Erick Erickson
I recommend _against_ issuing explicit commits from the client; let
your solrconfig.xml autocommit settings take care of it. Make sure
either your soft or hard commits open a new searcher for the docs
to be searchable.

I’ll bend a little bit if you can _guarantee_ that you only ever have one
indexing client running and basically only ever issue the commit at the
end.

There’s another strategy: do the solrClient.add() command with the
commitWithin parameter.

As far as failures, look at 
https://lucene.apache.org/solr/7_3_0/solr-core/org/apache/solr/update/processor/TolerantUpdateProcessor.html
that’ll give you a better clue about _which_ docs failed. From there, though,
it’s a bit of debugging to figure out why that particular doc failed, usually 
people
record the docs that failed for later analysis. and/or look at the Solr logs 
which
usually give a more detailed reason of _why_ a document failed...

Best,
Erick
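
A minimal SolrJ sketch of the commitWithin route described above (the
interval is an illustrative value, not from this thread):

    // Each add carries its own commit deadline, so the client never has
    // to issue an explicit commit().
    UpdateRequest req = new UpdateRequest();
    req.add(solrDocs);               // solrDocs: Collection<SolrInputDocument>
    req.setCommitWithin(30_000);     // milliseconds
    // if TolerantUpdateProcessor is configured in the update chain, a
    // per-request cap can be passed too: req.setParam("maxErrors", "10");
    req.process(solrClient);         // client already points at the core/collection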

> On Sep 17, 2020, at 1:09 PM, Steven White  wrote:
> 
> Hi everyone,
> 
> I'm trying to figure out when and how I should handle failures that may
> occur during indexing.  In the sample code below, look at my comment and
> let me know what state my index is in when things fail:
> 
>   SolrClient solrClient = new HttpSolrClient.Builder(url).build();
> 
>   solrClient.add(solrDocs);
> 
>   // #1: What to do if add() fails?  And how do I know if all or some of
> my docs in 'solrDocs' made it to the index or not ('solrDocs' is a list of
> 1 or more doc), should I retry add() again?  Retry with a smaller chunk?
> Etc.
> 
>   if (doCommit == true)
>   {
>  solrClient.commit();
> 
>   // #2: What to do if commit() fails?  Re-issue commit() again?
>   }
> 
> Thanks
> 
> Steven



Re: Doing what <copyField> does using SolrJ API

2020-09-17 Thread Erick Erickson
The script can actually be written in any number of scripting languages, 
python, groovy,
javascript etc. but Alexandre’s comments about javascript are well taken.

It all depends here on whether you ever want to search the fields 
individually. If you do,
you need to have them in your index as well as the copyField.

> On Sep 17, 2020, at 1:37 PM, Walter Underwood  wrote:
> 
> If you want to ignore a field being sent to Solr, you can set indexed=false 
> and 
> stored=false for that field in schema.xml. It will take up room in schema.xml 
> but
> zero room on disk.
> 
> wunder
> Walter Underwood
> wun...@wunderwood.org
> http://observer.wunderwood.org/  (my blog)
> 
>> On Sep 17, 2020, at 10:23 AM, Alexandre Rafalovitch  
>> wrote:
>> 
>> Solr has a whole pipeline that you can run during document ingesting before
>> the actual indexing happens. It is called Update Request Processor (URP)
>> and is defined in solrconfig.xml or in an override file. Obviously, since
>> you are indexing from SolrJ client, you have even more flexibility, but it
>> is good to know about anyway.
>> 
>> You can read all about it at:
>> https://lucene.apache.org/solr/guide/8_6/update-request-processors.html and
>> see the extensive list of processors you can leverage. The specific
>> mentioned one is this one:
>> https://lucene.apache.org/solr/8_6_0//solr-core/org/apache/solr/update/processor/StatelessScriptUpdateProcessorFactory.html
>> 
>> Just a word of warning that Stateless URP is using Javascript, which is
>> getting a bit of a complicated story as underlying JVM is upgraded (Oracle
>> dropped their javascript engine in JDK 14). So if one of the simpler URPs
>> will do the job or a chain of them, that may be a better path to take.
>> 
>> Regards,
>>  Alex.
>> 
>> 
>> On Thu, 17 Sep 2020 at 13:13, Steven White  wrote:
>> 
>>> Thanks Erick.  Where can I learn more about "stateless script update
>>> processor factory".  I don't know what you mean by this.
>>> 
>>> Steven
>>> 
>>> On Thu, Sep 17, 2020 at 1:08 PM Erick Erickson 
>>> wrote:
>>> 
>>>> 1000 fields is fine, you'll waste some cycles on bookkeeping, but I
>>> really
>>>> doubt you'll notice. That said, are these fields used for searching?
>>>> Because you do have control over what goes into the index if you can put
>>> a
>>>> "stateless script update processor factory" in your update chain. There
>>> you
>>>> can do whatever you want, including combine all the fields into one and
>>>> delete the original fields. There's no point in having your index
>>> cluttered
>>>> with unused fields, OTOH, it may not be worth the effort just to satisfy
>>> my
>>>> sense of aesthetics 
>>>> 
>>>> On Thu, Sep 17, 2020, 12:59 Steven White  wrote:
>>>> 
>>>>> Hi Eric,
>>>>> 
>>>>> Yes, this is coming from a DB.  Unfortunately I have no control over
>>> the
>>>>> list of fields.  Out of the 1000 fields that there maybe, no document,
>>>> that
>>>>> gets indexed into Solr will use more then about 50 and since i'm
>>> copying
>>>>> the values of those fields to the catch-all field and the catch-all
>>> field
>>>>> is my default search field, I don't expect any problem for having 1000
>>>>> fields in Solr's schema, or should I?
>>>>> 
>>>>> Thanks
>>>>> 
>>>>> Steven
>>>>> 
>>>>> 
>>>>> On Thu, Sep 17, 2020 at 8:23 AM Erick Erickson <
>>> erickerick...@gmail.com>
>>>>> wrote:
>>>>> 
>>>>>> “there over 1000 of them[fields]”
>>>>>> 
>>>>>> This is often a red flag in my experience. Solr will handle that many
>>>>>> fields, I’ve seen many more. But this is often a result of
>>>>>> “database thinking”, i.e. your mental model of how all this data
>>>>>> is from a DB perspective rather than a search perspective.
>>>>>> 
>>>>>> It’s unwieldy to have that many fields. Obviously I don’t know the
>>>>>> particulars of
>>>>>> your app, and maybe that’s the best design. Particularly if many of
>>> the
>>>>>> fields
>>>>>> are sparsely populated, i.e. only a small percentage of the documents
>>>> in
>>

Re: Doing what <copyField> does using SolrJ API

2020-09-17 Thread Walter Underwood
If you want to ignore a field being sent to Solr, you can set indexed=false and 
stored=false for that field in schema.xml. It will take up room in schema.xml 
but
zero room on disk.

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)

> On Sep 17, 2020, at 10:23 AM, Alexandre Rafalovitch  
> wrote:
> 
> Solr has a whole pipeline that you can run during document ingesting before
> the actual indexing happens. It is called Update Request Processor (URP)
> and is defined in solrconfig.xml or in an override file. Obviously, since
> you are indexing from SolrJ client, you have even more flexibility, but it
> is good to know about anyway.
> 
> You can read all about it at:
> https://lucene.apache.org/solr/guide/8_6/update-request-processors.html and
> see the extensive list of processors you can leverage. The specific
> mentioned one is this one:
> https://lucene.apache.org/solr/8_6_0//solr-core/org/apache/solr/update/processor/StatelessScriptUpdateProcessorFactory.html
> 
> Just a word of warning that Stateless URP is using Javascript, which is
> getting a bit of a complicated story as underlying JVM is upgraded (Oracle
> dropped their javascript engine in JDK 14). So if one of the simpler URPs
> will do the job or a chain of them, that may be a better path to take.
> 
> Regards,
>   Alex.
> 
> 
> On Thu, 17 Sep 2020 at 13:13, Steven White  wrote:
> 
>> Thanks Erick.  Where can I learn more about "stateless script update
>> processor factory".  I don't know what you mean by this.
>> 
>> Steven
>> 
>> On Thu, Sep 17, 2020 at 1:08 PM Erick Erickson 
>> wrote:
>> 
>>> 1000 fields is fine, you'll waste some cycles on bookkeeping, but I
>> really
>>> doubt you'll notice. That said, are these fields used for searching?
>>> Because you do have control over what goes into the index if you can put
>> a
>>> "stateless script update processor factory" in your update chain. There
>> you
>>> can do whatever you want, including combine all the fields into one and
>>> delete the original fields. There's no point in having your index
>> cluttered
>>> with unused fields, OTOH, it may not be worth the effort just to satisfy
>> my
>>> sense of aesthetics 
>>> 
>>> On Thu, Sep 17, 2020, 12:59 Steven White  wrote:
>>> 
>>>> Hi Eric,
>>>> 
>>>> Yes, this is coming from a DB.  Unfortunately I have no control over
>> the
>>>> list of fields.  Out of the 1000 fields that there maybe, no document,
>>> that
>>>> gets indexed into Solr will use more then about 50 and since i'm
>> copying
>>>> the values of those fields to the catch-all field and the catch-all
>> field
>>>> is my default search field, I don't expect any problem for having 1000
>>>> fields in Solr's schema, or should I?
>>>> 
>>>> Thanks
>>>> 
>>>> Steven
>>>> 
>>>> 
>>>> On Thu, Sep 17, 2020 at 8:23 AM Erick Erickson <
>> erickerick...@gmail.com>
>>>> wrote:
>>>> 
>>>>> “there over 1000 of them[fields]”
>>>>> 
>>>>> This is often a red flag in my experience. Solr will handle that many
>>>>> fields, I’ve seen many more. But this is often a result of
>>>>> “database thinking”, i.e. your mental model of how all this data
>>>>> is from a DB perspective rather than a search perspective.
>>>>> 
>>>>> It’s unwieldy to have that many fields. Obviously I don’t know the
>>>>> particulars of
>>>>> your app, and maybe that’s the best design. Particularly if many of
>> the
>>>>> fields
>>>>> are sparsely populated, i.e. only a small percentage of the documents
>>> in
>>>>> your
>>>>> corpus have any value for that field then taking a step back and
>>> looking
>>>>> at the design might save you some grief down the line.
>>>>> 
>>>>> For instance, I’ve seen designs where instead of
>>>>> field1:some_value
>>>>> field2:other_value….
>>>>> 
>>>>> you use a single field with _tokens_ like:
>>>>> field:field1_some_value
>>>>> field:field2_other_value
>>>>> 
>>>>> that drops the complexity and increases performance.
>>>>> 
>>>>> Anyway, just a thought you might want to consider.
>>>>> 
>>>>> Best,
>>>>> Erick
>>>>> 
>>>>>> On Sep 16, 2020, at 9:31 PM, Steven White 
>>>> wrote:
>>>>>> 
>>>>>> Hi everyone,
>>>>>> 
>>>>>> I figured it out.  It is as simple as creating a List and
>>> using
>>>>>> that as the value part for SolrInputDocument.addField() API.
>>>>>> 
>>>>>> Thanks,
>>>>>> 
>>>>>> Steven
>>>>>> 
>>>>>> 
>>>>>> On Wed, Sep 16, 2020 at 9:13 PM Steven White >> 
>>>>> wrote:
>>>>>> 
>>>>>>> Hi everyone,
>>>>>>> 
>>>>>>> I want to avoid creating a >>>>>> source="OneFieldOfMany"/> in my schema (there will be over 1000 of
>>>> them
>>>>> and
>>>>>>> maybe more so managing it will be a pain).  Instead, I want to use
>>>> SolrJ
>>>>>>> API to do what  does.  Any example of how I can do
>> this?
>>>> If
>>>>>>> there is an example online, that would be great.
>>>>>>> 
>>>>>>> Thanks in advance.
>>>>>>> 
>>>>>>> Steven
>>>>>>> 
>>>>> 
>>>>> 
>>>> 
>>> 
>> 



Re: Doing what <copyField> does using SolrJ API

2020-09-17 Thread Alexandre Rafalovitch
Solr has a whole pipeline that you can run during document ingesting before
the actual indexing happens. It is called Update Request Processor (URP)
and is defined in solrconfig.xml or in an override file. Obviously, since
you are indexing from SolrJ client, you have even more flexibility, but it
is good to know about anyway.

You can read all about it at:
https://lucene.apache.org/solr/guide/8_6/update-request-processors.html and
see the extensive list of processors you can leverage. The specific
mentioned one is this one:
https://lucene.apache.org/solr/8_6_0//solr-core/org/apache/solr/update/processor/StatelessScriptUpdateProcessorFactory.html

Just a word of warning that Stateless URP is using Javascript, which is
getting a bit of a complicated story as underlying JVM is upgraded (Oracle
dropped their javascript engine in JDK 14). So if one of the simpler URPs
will do the job or a chain of them, that may be a better path to take.

Regards,
   Alex.
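
Since the indexing here happens from a SolrJ client anyway, the same kind of
preprocessing can also be done client-side before the document is sent; a
rough sketch, with field names invented for illustration:

    // Combine several source fields into one catch-all field and drop the
    // originals, mimicking a copy-then-remove URP chain on the client.
    SolrInputDocument doc = buildDocumentFromDb();   // hypothetical helper
    List<Object> combined = new ArrayList<>();
    for (String f : Arrays.asList("title_t", "body_t", "notes_t")) {
        if (doc.containsKey(f)) {                    // SolrInputDocument is a Map
            combined.addAll(doc.getFieldValues(f));
            doc.removeField(f);
        }
    }
    doc.addField("catch_all", combined);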


On Thu, 17 Sep 2020 at 13:13, Steven White  wrote:

> Thanks Erick.  Where can I learn more about "stateless script update
> processor factory".  I don't know what you mean by this.
>
> Steven
>
> On Thu, Sep 17, 2020 at 1:08 PM Erick Erickson 
> wrote:
>
> > 1000 fields is fine, you'll waste some cycles on bookkeeping, but I
> really
> > doubt you'll notice. That said, are these fields used for searching?
> > Because you do have control over what goes into the index if you can put
> a
> > "stateless script update processor factory" in your update chain. There
> you
> > can do whatever you want, including combine all the fields into one and
> > delete the original fields. There's no point in having your index
> cluttered
> > with unused fields, OTOH, it may not be worth the effort just to satisfy
> my
> > sense of aesthetics 
> >
> > On Thu, Sep 17, 2020, 12:59 Steven White  wrote:
> >
> > > Hi Eric,
> > >
> > > Yes, this is coming from a DB.  Unfortunately I have no control over
> the
> > > list of fields.  Out of the 1000 fields that there maybe, no document,
> > that
> > > gets indexed into Solr will use more then about 50 and since i'm
> copying
> > > the values of those fields to the catch-all field and the catch-all
> field
> > > is my default search field, I don't expect any problem for having 1000
> > > fields in Solr's schema, or should I?
> > >
> > > Thanks
> > >
> > > Steven
> > >
> > >
> > > On Thu, Sep 17, 2020 at 8:23 AM Erick Erickson <
> erickerick...@gmail.com>
> > > wrote:
> > >
> > > > “there over 1000 of them[fields]”
> > > >
> > > > This is often a red flag in my experience. Solr will handle that many
> > > > fields, I’ve seen many more. But this is often a result of
> > > > “database thinking”, i.e. your mental model of how all this data
> > > > is from a DB perspective rather than a search perspective.
> > > >
> > > > It’s unwieldy to have that many fields. Obviously I don’t know the
> > > > particulars of
> > > > your app, and maybe that’s the best design. Particularly if many of
> the
> > > > fields
> > > > are sparsely populated, i.e. only a small percentage of the documents
> > in
> > > > your
> > > > corpus have any value for that field then taking a step back and
> > looking
> > > > at the design might save you some grief down the line.
> > > >
> > > > For instance, I’ve seen designs where instead of
> > > > field1:some_value
> > > > field2:other_value….
> > > >
> > > > you use a single field with _tokens_ like:
> > > > field:field1_some_value
> > > > field:field2_other_value
> > > >
> > > > that drops the complexity and increases performance.
> > > >
> > > > Anyway, just a thought you might want to consider.
> > > >
> > > > Best,
> > > > Erick
> > > >
> > > > > On Sep 16, 2020, at 9:31 PM, Steven White 
> > > wrote:
> > > > >
> > > > > Hi everyone,
> > > > >
> > > > > I figured it out.  It is as simple as creating a List and
> > using
> > > > > that as the value part for SolrInputDocument.addField() API.
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Steven
> > > > >
> > > > >
> > > > > On Wed, Sep 16, 2020 at 9:13 PM Steven White  >
> > > > wrote:
> > > > >
> > > > >> Hi everyone,
> > > > >>
> > > > >> I want to avoid creating a  > > > >> source="OneFieldOfMany"/> in my schema (there will be over 1000 of
> > > them
> > > > and
> > > > >> maybe more so managing it will be a pain).  Instead, I want to use
> > > SolrJ
> > > > >> API to do what  does.  Any example of how I can do
> this?
> > > If
> > > > >> there is an example online, that would be great.
> > > > >>
> > > > >> Thanks in advance.
> > > > >>
> > > > >> Steven
> > > > >>
> > > >
> > > >
> > >
> >
>


Re: NPE Issue with atomic update to nested document or child document through SolrJ

2020-09-17 Thread Alexandre Rafalovitch
Did you reindex the original document after you added a new field? If
not, then the previously indexed content is missing it and your code
paths will get out of sync.

Regards,
   Alex.
P.s. I haven't done what you are doing before, so there may be
something I am missing myself.


On Thu, 17 Sep 2020 at 12:46, Pratik Patel  wrote:
>
> Thanks for your reply Alexandre.
>
> I have "_root_" and "_nest_path_" fields in my schema but not
> "_nest_parent_".
>
>
> 
> 
> <field name="_root_" ... docValues="false" />
> <fieldType name="_nest_path_" class="solr.NestPathField" />
>
> I ran my test after adding the "_nest_parent_" field and I am not getting
> NPE any more which is good. Thanks!
>
> But looking at the documents in the index, I see that after the atomic
> update, now there are two children documents with the same id. One document
> has old values and another one has new values. Shouldn't they be merged
> based on the "id"? Do we need to specify anything else in the request to
> ensure that documents are merged/updated and not duplicated?
>
> For your reference, below is the test I am running now.
>
> // update field of one child doc
> SolrInputDocument sdoc = new SolrInputDocument(  );
> sdoc.addField( "id", testChildPOJO.id() );
> sdoc.addField( "conceptid", testChildPOJO.conceptid() );
> sdoc.addField( "storeid", "foo" );
> sdoc.setField( "fieldName",
> java.util.Collections.singletonMap("set", Collections.list("bar" ) ));
>
> final UpdateRequest req = new UpdateRequest();
> req.withRoute( pojo1.id() );// parent id
> req.add(sdoc);
>
> collection.client.request( req, collection.getCollectionName()
> );
> collection.client.commit();
>
>
> Resulting documents :
>
> {id=c1_child1, conceptid=c1, storeid=s1, fieldName=c1_child1_field_value1,
> startTime=Mon Sep 07 12:40:37 EDT 2020, integerField_iDF=10,
> booleanField_bDF=true, _root_=abcd, _version_=1678099970090074112}
> {id=c1_child1, conceptid=c1, storeid=foo, fieldName=bar, startTime=Mon Sep
> 07 12:40:37 EDT 2020, integerField_iDF=10, booleanField_bDF=true,
> _root_=abcd, _version_=1678099970405695488}
>
>
>
>
>
>
> On Thu, Sep 17, 2020 at 12:01 PM Alexandre Rafalovitch 
> wrote:
>
> > Can you double-check your schema to see if you have all the fields
> > required to support nested documents. You are supposed to get away
> > with just _root_, but really you should also include _nest_path and
> > _nest_parent_. Your particular exception seems to be triggering
> > something (maybe a bug) related to - possibly - missing _nest_path_
> > field.
> >
> > See:
> > https://lucene.apache.org/solr/guide/8_5/indexing-nested-documents.html#indexing-nested-documents
> >
> > Regards,
> >Alex.
> >
> > On Wed, 16 Sep 2020 at 13:28, Pratik Patel  wrote:
> > >
> > > Hello Everyone,
> > >
> > > I am trying to update a field of a child document using atomic updates
> > > feature. I am using solr and solrJ version 8.5.0
> > >
> > > I have ensured that my schema satisfies the conditions for atomic updates
> > > and I am able to do atomic updates on normal documents but with nested
> > > child documents, I am getting a Null Pointer Exception. Following is the
> > > simple test which I am trying.
> > >
> > > TestPojo  pojo1  = new TestPojo().cId( "abcd" )
> > > >  .conceptid( "c1" )
> > > >  .storeid( storeId )
> > > >  .testChildPojos(
> > > > Collections.list( testChildPOJO, testChildPOJO2,
> > > >
> > testChildPOJO3 )
> > > > );
> > > > TestChildPOJOtestChildPOJO = new TestChildPOJO().cId(
> > > > "c1_child1" )
> > > >   .conceptid( "c1"
> > )
> > > >   .storeid(
> > storeId )
> > > >   .fieldName(
> > > > "c1_child1_field_value1" )
> > > >   .startTime(
> > > > Date.from( now.minus( 10, ChronoUnit.DAYS ) ) )
> > > >
> >  .inte

Re: Doing what <copyField> does using SolrJ API

2020-09-17 Thread Steven White
Thanks Erick.  Where can I learn more about "stateless script update
processor factory".  I don't know what you mean by this.

Steven

On Thu, Sep 17, 2020 at 1:08 PM Erick Erickson 
wrote:

> 1000 fields is fine, you'll waste some cycles on bookkeeping, but I really
> doubt you'll notice. That said, are these fields used for searching?
> Because you do have control over what goes into the index if you can put a
> "stateless script update processor factory" in your update chain. There you
> can do whatever you want, including combine all the fields into one and
> delete the original fields. There's no point in having your index cluttered
> with unused fields, OTOH, it may not be worth the effort just to satisfy my
> sense of aesthetics 
>
> On Thu, Sep 17, 2020, 12:59 Steven White  wrote:
>
> > Hi Eric,
> >
> > Yes, this is coming from a DB.  Unfortunately I have no control over the
> > list of fields.  Out of the 1000 fields that there may be, no document
> that
> > gets indexed into Solr will use more than about 50, and since I'm copying
> > the values of those fields to the catch-all field and the catch-all field
> > is my default search field, I don't expect any problem for having 1000
> > fields in Solr's schema, or should I?
> >
> > Thanks
> >
> > Steven
> >
> >
> > On Thu, Sep 17, 2020 at 8:23 AM Erick Erickson 
> > wrote:
> >
> > > “there over 1000 of them[fields]”
> > >
> > > This is often a red flag in my experience. Solr will handle that many
> > > fields, I’ve seen many more. But this is often a result of
> > > “database thinking”, i.e. your mental model of how all this data
> > > is from a DB perspective rather than a search perspective.
> > >
> > > It’s unwieldy to have that many fields. Obviously I don’t know the
> > > particulars of
> > > your app, and maybe that’s the best design. Particularly if many of the
> > > fields
> > > are sparsely populated, i.e. only a small percentage of the documents
> in
> > > your
> > > corpus have any value for that field then taking a step back and
> looking
> > > at the design might save you some grief down the line.
> > >
> > > For instance, I’ve seen designs where instead of
> > > field1:some_value
> > > field2:other_value….
> > >
> > > you use a single field with _tokens_ like:
> > > field:field1_some_value
> > > field:field2_other_value
> > >
> > > that drops the complexity and increases performance.
> > >
> > > Anyway, just a thought you might want to consider.
> > >
> > > Best,
> > > Erick
> > >
> > > > On Sep 16, 2020, at 9:31 PM, Steven White 
> > wrote:
> > > >
> > > > Hi everyone,
> > > >
> > > > I figured it out.  It is as simple as creating a List and
> using
> > > > that as the value part for SolrInputDocument.addField() API.
> > > >
> > > > Thanks,
> > > >
> > > > Steven
> > > >
> > > >
> > > > On Wed, Sep 16, 2020 at 9:13 PM Steven White 
> > > wrote:
> > > >
> > > >> Hi everyone,
> > > >>
> > > >> I want to avoid creating a  > > >> source="OneFieldOfMany"/> in my schema (there will be over 1000 of
> > them
> > > and
> > > >> maybe more so managing it will be a pain).  Instead, I want to use
> > SolrJ
> > > >> API to do what  does.  Any example of how I can do this?
> > If
> > > >> there is an example online, that would be great.
> > > >>
> > > >> Thanks in advance.
> > > >>
> > > >> Steven
> > > >>
> > >
> > >
> >
>


Handling failure when adding docs to Solr using SolrJ

2020-09-17 Thread Steven White
Hi everyone,

I'm trying to figure out when and how I should handle failures that may
occur during indexing.  In the sample code below, look at my comment and
let me know what state my index is in when things fail:

   SolrClient solrClient = new HttpSolrClient.Builder(url).build();

   solrClient.add(solrDocs);

   // #1: What to do if add() fails?  And how do I know if all or some of
my docs in 'solrDocs' made it to the index or not ('solrDocs' is a list of
1 or more doc), should I retry add() again?  Retry with a smaller chunk?
Etc.

   if (doCommit == true)
   {
      solrClient.commit();

      // #2: What to do if commit() fails?  Re-issue commit() again?
   }

Thanks

Steven
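
One common pattern for #1, sketched as an illustration rather than a
definitive answer: if the batched add() throws, re-send the documents one at
a time so the rejected ones can be identified and logged.

    try {
        solrClient.add(solrDocs);
    } catch (Exception batchFailure) {
        // fall back to per-document adds to isolate the bad documents
        for (SolrInputDocument doc : solrDocs) {
            try {
                solrClient.add(doc);
            } catch (Exception docFailure) {
                // record doc.getFieldValue("id") and the exception for analysis
            }
        }
    }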


Re: Doing what <copyField> does using SolrJ API

2020-09-17 Thread Erick Erickson
1000 fields is fine, you'll waste some cycles on bookkeeping, but I really
doubt you'll notice. That said, are these fields used for searching?
Because you do have control over what goes into the index if you can put a
"stateless script update processor factory" in your update chain. There you
can do whatever you want, including combine all the fields into one and
delete the original fields. There's no point in having your index cluttered
with unused fields, OTOH, it may not be worth the effort just to satisfy my
sense of aesthetics 

On Thu, Sep 17, 2020, 12:59 Steven White  wrote:

> Hi Eric,
>
> Yes, this is coming from a DB.  Unfortunately I have no control over the
> list of fields.  Out of the 1000 fields that there may be, no document that
> gets indexed into Solr will use more than about 50, and since I'm copying
> the values of those fields to the catch-all field and the catch-all field
> is my default search field, I don't expect any problem for having 1000
> fields in Solr's schema, or should I?
>
> Thanks
>
> Steven
>
>
> On Thu, Sep 17, 2020 at 8:23 AM Erick Erickson 
> wrote:
>
> > “there over 1000 of them[fields]”
> >
> > This is often a red flag in my experience. Solr will handle that many
> > fields, I’ve seen many more. But this is often a result of
> > “database thinking”, i.e. your mental model of how all this data
> > is from a DB perspective rather than a search perspective.
> >
> > It’s unwieldy to have that many fields. Obviously I don’t know the
> > particulars of
> > your app, and maybe that’s the best design. Particularly if many of the
> > fields
> > are sparsely populated, i.e. only a small percentage of the documents in
> > your
> > corpus have any value for that field then taking a step back and looking
> > at the design might save you some grief down the line.
> >
> > For instance, I’ve seen designs where instead of
> > field1:some_value
> > field2:other_value….
> >
> > you use a single field with _tokens_ like:
> > field:field1_some_value
> > field:field2_other_value
> >
> > that drops the complexity and increases performance.
> >
> > Anyway, just a thought you might want to consider.
> >
> > Best,
> > Erick
> >
> > > On Sep 16, 2020, at 9:31 PM, Steven White 
> wrote:
> > >
> > > Hi everyone,
> > >
> > > I figured it out.  It is as simple as creating a List and using
> > > that as the value part for SolrInputDocument.addField() API.
> > >
> > > Thanks,
> > >
> > > Steven
> > >
> > >
> > > On Wed, Sep 16, 2020 at 9:13 PM Steven White 
> > wrote:
> > >
> > >> Hi everyone,
> > >>
> > >> I want to avoid creating a  > >> source="OneFieldOfMany"/> in my schema (there will be over 1000 of
> them
> > and
> > >> maybe more so managing it will be a pain).  Instead, I want to use
> SolrJ
> > >> API to do what  does.  Any example of how I can do this?
> If
> > >> there is an example online, that would be great.
> > >>
> > >> Thanks in advance.
> > >>
> > >> Steven
> > >>
> >
> >
>


Re: Doing what <copyField> does using SolrJ API

2020-09-17 Thread Steven White
Hi Eric,

Yes, this is coming from a DB.  Unfortunately I have no control over the
list of fields.  Out of the 1000 fields that there may be, no document that
gets indexed into Solr will use more than about 50, and since I'm copying
the values of those fields to the catch-all field and the catch-all field
is my default search field, I don't expect any problem for having 1000
fields in Solr's schema, or should I?

Thanks

Steven


On Thu, Sep 17, 2020 at 8:23 AM Erick Erickson 
wrote:

> “there over 1000 of them[fields]”
>
> This is often a red flag in my experience. Solr will handle that many
> fields, I’ve seen many more. But this is often a result of
> “database thinking”, i.e. your mental model of how all this data
> is from a DB perspective rather than a search perspective.
>
> It’s unwieldy to have that many fields. Obviously I don’t know the
> particulars of
> your app, and maybe that’s the best design. Particularly if many of the
> fields
> are sparsely populated, i.e. only a small percentage of the documents in
> your
> corpus have any value for that field then taking a step back and looking
> at the design might save you some grief down the line.
>
> For instance, I’ve seen designs where instead of
> field1:some_value
> field2:other_value….
>
> you use a single field with _tokens_ like:
> field:field1_some_value
> field:field2_other_value
>
> that drops the complexity and increases performance.
>
> Anyway, just a thought you might want to consider.
>
> Best,
> Erick
>
> > On Sep 16, 2020, at 9:31 PM, Steven White  wrote:
> >
> > Hi everyone,
> >
> > I figured it out.  It is as simple as creating a List and using
> > that as the value part for SolrInputDocument.addField() API.
> >
> > Thanks,
> >
> > Steven
> >
> >
> > On Wed, Sep 16, 2020 at 9:13 PM Steven White 
> wrote:
> >
> >> Hi everyone,
> >>
> >> I want to avoid creating a <copyField dest="..." source="OneFieldOfMany"/> in my schema (there will be over 1000 of them
> and
> >> maybe more so managing it will be a pain).  Instead, I want to use SolrJ
> >> API to do what <copyField> does.  Any example of how I can do this?  If
> >> there is an example online, that would be great.
> >>
> >> Thanks in advance.
> >>
> >> Steven
> >>
>
>


Re: NPE Issue with atomic update to nested document or child document through SolrJ

2020-09-17 Thread Pratik Patel
Thanks for your reply Alexandre.

I have "_root_" and "_nest_path_" fields in my schema but not
"_nest_parent_".








I ran my test after adding the "_nest_parent_" field and I am not getting
NPE any more which is good. Thanks!

But looking at the documents in the index, I see that after the atomic
update, now there are two children documents with the same id. One document
has old values and another one has new values. Shouldn't they be merged
based on the "id"? Do we need to specify anything else in the request to
ensure that documents are merged/updated and not duplicated?

For your reference, below is the test I am running now.

// update field of one child doc
SolrInputDocument sdoc = new SolrInputDocument(  );
sdoc.addField( "id", testChildPOJO.id() );
sdoc.addField( "conceptid", testChildPOJO.conceptid() );
sdoc.addField( "storeid", "foo" );
sdoc.setField( "fieldName",
java.util.Collections.singletonMap("set", Collections.list("bar" ) ));

final UpdateRequest req = new UpdateRequest();
req.withRoute( pojo1.id() );// parent id
req.add(sdoc);

collection.client.request( req, collection.getCollectionName()
);
collection.client.commit();


Resulting documents :

{id=c1_child1, conceptid=c1, storeid=s1, fieldName=c1_child1_field_value1,
startTime=Mon Sep 07 12:40:37 EDT 2020, integerField_iDF=10,
booleanField_bDF=true, _root_=abcd, _version_=1678099970090074112}
{id=c1_child1, conceptid=c1, storeid=foo, fieldName=bar, startTime=Mon Sep
07 12:40:37 EDT 2020, integerField_iDF=10, booleanField_bDF=true,
_root_=abcd, _version_=1678099970405695488}






On Thu, Sep 17, 2020 at 12:01 PM Alexandre Rafalovitch 
wrote:

> Can you double-check your schema to see if you have all the fields
> required to support nested documents. You are supposed to get away
> with just _root_, but really you should also include _nest_path and
> _nest_parent_. Your particular exception seems to be triggering
> something (maybe a bug) related to - possibly - missing _nest_path_
> field.
>
> See:
> https://lucene.apache.org/solr/guide/8_5/indexing-nested-documents.html#indexing-nested-documents
>
> Regards,
>Alex.
>
> On Wed, 16 Sep 2020 at 13:28, Pratik Patel  wrote:
> >
> > Hello Everyone,
> >
> > I am trying to update a field of a child document using atomic updates
> > feature. I am using solr and solrJ version 8.5.0
> >
> > I have ensured that my schema satisfies the conditions for atomic updates
> > and I am able to do atomic updates on normal documents but with nested
> > child documents, I am getting a Null Pointer Exception. Following is the
> > simple test which I am trying.
> >
> > TestPojo  pojo1  = new TestPojo().cId( "abcd" )
> > >  .conceptid( "c1" )
> > >  .storeid( storeId )
> > >  .testChildPojos(
> > > Collections.list( testChildPOJO, testChildPOJO2,
> > >
> testChildPOJO3 )
> > > );
> > > TestChildPOJOtestChildPOJO = new TestChildPOJO().cId(
> > > "c1_child1" )
> > >   .conceptid( "c1"
> )
> > >   .storeid(
> storeId )
> > >   .fieldName(
> > > "c1_child1_field_value1" )
> > >   .startTime(
> > > Date.from( now.minus( 10, ChronoUnit.DAYS ) ) )
> > >
>  .integerField_iDF(
> > > 10 )
> > >
> > > .booleanField_bDF(true);
> > > // index pojo1 with child testChildPOJO
> > > SolrInputDocument sdoc = new SolrInputDocument();
> > > sdoc.addField( "_route_", pojo1.cId() );
> > > sdoc.addField( "id", testChildPOJO.cId() );
> > > sdoc.addField( "conceptid", testChildPOJO.conceptid() );
> > > sdoc.addField( "storeid", testChildPOJO.cId() );
> > > sdoc.setField( "fieldName", java.util.Collections.singletonMap("set",
> > > Collections.list(testChildPOJO.fieldName() + postfix) ) ); // modify
> field
> > > "fieldName"
> > > collection.client.add( sdoc );   // results in NPE!
> >
> >
> > Stack Trace:
> >
> > ERROR org.apache.solr.client.solrj.impl.BaseCloud

Re: NPE Issue with atomic update to nested document or child document through SolrJ

2020-09-17 Thread Alexandre Rafalovitch
Can you double-check your schema to see if you have all the fields
required to support nested documents. You are supposed to get away
with just _root_, but really you should also include _nest_path and
_nest_parent_. Your particular exception seems to be triggering
something (maybe a bug) related to - possibly - missing _nest_path_
field.

See: 
https://lucene.apache.org/solr/guide/8_5/indexing-nested-documents.html#indexing-nested-documents

Regards,
   Alex.
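
If the schema is mutable, the missing field can also be added from SolrJ
through the Schema API; a rough sketch, with the attribute values assumed
rather than taken from this thread:

    Map<String, Object> attrs = new LinkedHashMap<>();
    attrs.put("name", "_nest_parent_");
    attrs.put("type", "string");
    attrs.put("indexed", true);
    attrs.put("stored", true);
    new SchemaRequest.AddField(attrs).process(collection.client,
        collection.getCollectionName());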

On Wed, 16 Sep 2020 at 13:28, Pratik Patel  wrote:
>
> Hello Everyone,
>
> I am trying to update a field of a child document using atomic updates
> feature. I am using solr and solrJ version 8.5.0
>
> I have ensured that my schema satisfies the conditions for atomic updates
> and I am able to do atomic updates on normal documents but with nested
> child documents, I am getting a Null Pointer Exception. Following is the
> simple test which I am trying.
>
> TestPojo  pojo1  = new TestPojo().cId( "abcd" )
> >  .conceptid( "c1" )
> >  .storeid( storeId )
> >  .testChildPojos(
> > Collections.list( testChildPOJO, testChildPOJO2,
> >  testChildPOJO3 )
> > );
> > TestChildPOJOtestChildPOJO = new TestChildPOJO().cId(
> > "c1_child1" )
> >   .conceptid( "c1" )
> >   .storeid( storeId )
> >   .fieldName(
> > "c1_child1_field_value1" )
> >   .startTime(
> > Date.from( now.minus( 10, ChronoUnit.DAYS ) ) )
> >   .integerField_iDF(
> > 10 )
> >
> > .booleanField_bDF(true);
> > // index pojo1 with child testChildPOJO
> > SolrInputDocument sdoc = new SolrInputDocument();
> > sdoc.addField( "_route_", pojo1.cId() );
> > sdoc.addField( "id", testChildPOJO.cId() );
> > sdoc.addField( "conceptid", testChildPOJO.conceptid() );
> > sdoc.addField( "storeid", testChildPOJO.cId() );
> > sdoc.setField( "fieldName", java.util.Collections.singletonMap("set",
> > Collections.list(testChildPOJO.fieldName() + postfix) ) ); // modify field
> > "fieldName"
> > collection.client.add( sdoc );   // results in NPE!
>
>
> Stack Trace:
>
> ERROR org.apache.solr.client.solrj.impl.BaseCloudSolrClient - Request to
> > collection [collectionTest2] failed due to (500)
> > org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error
> > from server at
> > http://172.15.1.100:8081/solr/collectionTest2_shard1_replica_n1:
> > java.lang.NullPointerException
> > at
> > org.apache.solr.update.processor.AtomicUpdateDocumentMerger.getFieldFromHierarchy(AtomicUpdateDocumentMerger.java:308)
> > at
> > org.apache.solr.update.processor.AtomicUpdateDocumentMerger.mergeChildDoc(AtomicUpdateDocumentMerger.java:405)
> > at
> > org.apache.solr.update.processor.DistributedUpdateProcessor.getUpdatedDocument(DistributedUpdateProcessor.java:711)
> > at
> > org.apache.solr.update.processor.DistributedUpdateProcessor.doVersionAdd(DistributedUpdateProcessor.java:374)
> > at
> > org.apache.solr.update.processor.DistributedUpdateProcessor.lambda$versionAdd$0(DistributedUpdateProcessor.java:339)
> > at org.apache.solr.update.VersionBucket.runWithLock(VersionBucket.java:50)
> > at
> > org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(DistributedUpdateProcessor.java:339)
> > at
> > org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:225)
> > at
> > org.apache.solr.update.processor.DistributedZkUpdateProcessor.processAdd(DistributedZkUpdateProcessor.java:245)
> > at
> > org.apache.solr.update.processor.LogUpdateProcessorFactory$LogUpdateProcessor.processAdd(LogUpdateProcessorFactory.java:103)
> > at
> > org.apache.solr.handler.loader.JavabinLoader$1.update(JavabinLoader.java:110)
> > at
> > org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$StreamingCodec.readOuterMostDocIterator(JavaBinUpdateRequestCodec.java:332)
> > at
> > org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$StreamingCodec.readIterator(JavaBinUpdateRequestCodec.java:281)
> > at
> > org.apache

Re: NPE Issue with atomic update to nested document or child document through SolrJ

2020-09-17 Thread pratik@semandex
Following are the approaches I have tried so far and both results in NPE.



*approach 1

TestChildPOJO  testChildPOJO = new TestChildPOJO().cId( "c1_child1" )
  .conceptid( "c1" )
  .storeid( storeId )
  .fieldName(
"c1_child1_field_value1" )
  .startTime( Date.from(
now.minus( 10, ChronoUnit.DAYS ) ) )
  .integerField_iDF( 10
)
 
.booleanField_bDF(true);


TestPojo  pojo1  = new TestPojo().cId( "abcd" )
 .conceptid( "c1" )
 .storeid( storeId )
 .testChildPojos(
Collections.list( testChildPOJO, testChildPOJO2, testChildPOJO3 ) );

 

// index pojo1 with child testChildPOJO

SolrInputDocument sdoc = new SolrInputDocument();
sdoc.addField( "_route_", pojo1.cId() );
sdoc.addField( "id", testChildPOJO.cId() );
sdoc.addField( "conceptid", testChildPOJO.conceptid() );
sdoc.addField( "storeid", testChildPOJO.cId() );
sdoc.setField( "fieldName", java.util.Collections.singletonMap("set",
Collections.list(testChildPOJO.fieldName() + postfix) ) );  // modify field
"fieldName"

collection.client.add( sdoc );  
// results in NPE!

*approach 1


*approach 2

SolrInputDocument sdoc = new SolrInputDocument(  );
sdoc.addField( "id", testChildPOJO.id() );
sdoc.setField( "fieldName",
java.util.Collections.singletonMap("set", testChildPOJO.fieldName() +
postfix) );
final UpdateRequest req = new UpdateRequest();
req.withRoute( pojo1.id() );
req.add(sdoc);
   
collection.client.request( req, collection.getCollectionName()
);
req.commit( collection.client, collection.getCollectionName());


*approach 2




--
Sent from: https://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Re: Doing what <copyField> does using SolrJ API

2020-09-17 Thread Erick Erickson
“there over 1000 of them[fields]”

This is often a red flag in my experience. Solr will handle that many 
fields, I’ve seen many more. But this is often a result of 
“database thinking”, i.e. your mental model of how all this data
is from a DB perspective rather than a search perspective.

It’s unwieldy to have that many fields. Obviously I don’t know the particulars 
of
your app, and maybe that’s the best design. Particularly if many of the fields
are sparsely populated, i.e. only a small percentage of the documents in your
corpus have any value for that field then taking a step back and looking
at the design might save you some grief down the line.

For instance, I’ve seen designs where instead of
field1:some_value
field2:other_value….

you use a single field with _tokens_ like:
field:field1_some_value
field:field2_other_value

that drops the complexity and increases performance.

Anyway, just a thought you might want to consider.

Best,
Erick
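
A small illustration of that token pattern (field and token names are
invented for the example):

    // One multiValued field of key_value tokens replaces many sparse fields.
    SolrInputDocument doc = new SolrInputDocument();
    doc.addField("id", "doc-1");
    doc.addField("kv_tokens", "color_red");     // instead of color:red
    doc.addField("kv_tokens", "size_large");    // instead of size:large
    // query side: fq=kv_tokens:color_red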

> On Sep 16, 2020, at 9:31 PM, Steven White  wrote:
> 
> Hi everyone,
> 
> I figured it out.  It is as simple as creating a List and using
> that as the value part for SolrInputDocument.addField() API.
> 
> Thanks,
> 
> Steven
> 
> 
> On Wed, Sep 16, 2020 at 9:13 PM Steven White  wrote:
> 
>> Hi everyone,
>> 
>> I want to avoid creating a <copyField dest="..." source="OneFieldOfMany"/> in my schema (there will be over 1000 of them and
>> maybe more so managing it will be a pain).  Instead, I want to use SolrJ
>> API to do what <copyField> does.  Any example of how I can do this?  If
>> there is an example online, that would be great.
>> 
>> Thanks in advance.
>> 
>> Steven
>> 



Re: Doing what <copyField> does using SolrJ API

2020-09-16 Thread Steven White
Hi everyone,

I figured it out.  It is as simple as creating a List and using
that as the value part for SolrInputDocument.addField() API.

Thanks,

Steven


On Wed, Sep 16, 2020 at 9:13 PM Steven White  wrote:

> Hi everyone,
>
> I want to avoid creating a <copyField dest="..." source="OneFieldOfMany"/> in my schema (there will be over 1000 of them and
> maybe more so managing it will be a pain).  Instead, I want to use SolrJ
> API to do what <copyField> does.  Any example of how I can do this?  If
> there is an example online, that would be great.
>
> Thanks in advance.
>
> Steven
>


Doing what <copyField> does using SolrJ API

2020-09-16 Thread Steven White
Hi everyone,

I want to avoid creating a <copyField dest="..." source="OneFieldOfMany"/> in my schema (there will be over 1000 of them and
maybe more so managing it will be a pain).  Instead, I want to use SolrJ
API to do what <copyField> does.  Any example of how I can do this?  If
there is an example online, that would be great.

Thanks in advance.

Steven
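
A minimal sketch of the List-valued addField() approach that was settled on
above; "CatchAllField" is an invented name and must be multiValued in the
schema:

    SolrInputDocument doc = new SolrInputDocument();
    doc.addField("id", "doc-1");
    List<String> catchAll = new ArrayList<>();
    // for each DB field that has data: index it, and also copy the value
    doc.addField("Title", "Some title");
    catchAll.add("Some title");
    doc.addField("Author", "Some author");
    catchAll.add("Some author");
    doc.addField("CatchAllField", catchAll);   // what <copyField> would have done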


Re: NPE Issue with atomic update to nested document or child document through SolrJ

2020-09-16 Thread Pratik Patel
Looking at some other unit tests in repo, I tried an approach using
UpdateRequest as follows.

SolrInputDocument sdoc = new SolrInputDocument(  );
> sdoc.addField( "id", testChildPOJO.id() );
> sdoc.setField( "fieldName",
> java.util.Collections.singletonMap("set", testChildPOJO.fieldName() +
> postfix) );
> final UpdateRequest req = new UpdateRequest();
> req.withRoute( pojo1.id() );
> req.add(sdoc);
>
> collection.client.request( req, collection.getCollectionName()
> );
> req.commit( collection.client, collection.getCollectionName());


But this also results in the SAME Null Pointer Exception.

Looking at the source code, it looks like "fieldPath" is null below.



>  AtomicUpdateDocumentMerger.getFieldFromHierarchy(SolrInputDocument
> completeHierarchy, String fieldPath) {
> final List<String> docPaths =
> StrUtils.splitSmart(fieldPath.substring(1), '/');
> ..
>}


Any idea what's wrong here?

Thanks

On Wed, Sep 16, 2020 at 1:27 PM Pratik Patel  wrote:

> Hello Everyone,
>
> I am trying to update a field of a child document using atomic updates
> feature. I am using solr and solrJ version 8.5.0
>
> I have ensured that my schema satisfies the conditions for atomic updates
> and I am able to do atomic updates on normal documents but with nested
> child documents, I am getting a Null Pointer Exception. Following is the
> simple test which I am trying.
>
> TestPojo  pojo1  = new TestPojo().cId( "abcd" )
>>  .conceptid( "c1" )
>>  .storeid( storeId )
>>  .testChildPojos(
>> Collections.list( testChildPOJO, testChildPOJO2,
>>  testChildPOJO3 )
>> );
>> TestChildPOJOtestChildPOJO = new TestChildPOJO().cId(
>> "c1_child1" )
>>   .conceptid( "c1" )
>>   .storeid( storeId )
>>   .fieldName(
>> "c1_child1_field_value1" )
>>   .startTime(
>> Date.from( now.minus( 10, ChronoUnit.DAYS ) ) )
>>   .integerField_iDF(
>> 10 )
>>
>> .booleanField_bDF(true);
>> // index pojo1 with child testChildPOJO
>> SolrInputDocument sdoc = new SolrInputDocument();
>> sdoc.addField( "_route_", pojo1.cId() );
>> sdoc.addField( "id", testChildPOJO.cId() );
>> sdoc.addField( "conceptid", testChildPOJO.conceptid() );
>> sdoc.addField( "storeid", testChildPOJO.cId() );
>> sdoc.setField( "fieldName", java.util.Collections.singletonMap("set",
>> Collections.list(testChildPOJO.fieldName() + postfix) ) ); // modify field
>> "fieldName"
>> collection.client.add( sdoc );   // results in NPE!
>
>
> Stack Trace:
>
> ERROR org.apache.solr.client.solrj.impl.BaseCloudSolrClient - Request to
>> collection [collectionTest2] failed due to (500)
>> org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error
>> from server at
>> http://172.15.1.100:8081/solr/collectionTest2_shard1_replica_n1:
>> java.lang.NullPointerException
>> at
>> org.apache.solr.update.processor.AtomicUpdateDocumentMerger.getFieldFromHierarchy(AtomicUpdateDocumentMerger.java:308)
>> at
>> org.apache.solr.update.processor.AtomicUpdateDocumentMerger.mergeChildDoc(AtomicUpdateDocumentMerger.java:405)
>> at
>> org.apache.solr.update.processor.DistributedUpdateProcessor.getUpdatedDocument(DistributedUpdateProcessor.java:711)
>> at
>> org.apache.solr.update.processor.DistributedUpdateProcessor.doVersionAdd(DistributedUpdateProcessor.java:374)
>> at
>> org.apache.solr.update.processor.DistributedUpdateProcessor.lambda$versionAdd$0(DistributedUpdateProcessor.java:339)
>> at org.apache.solr.update.VersionBucket.runWithLock(VersionBucket.java:50)
>> at
>> org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(DistributedUpdateProcessor.java:339)
>> at
>> org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:225)
>> at
>> org.apache.solr.update.processor.DistributedZkUpdateProc

NPE Issue with atomic update to nested document or child document through SolrJ

2020-09-16 Thread Pratik Patel
Hello Everyone,

I am trying to update a field of a child document using atomic updates
feature. I am using solr and solrJ version 8.5.0

I have ensured that my schema satisfies the conditions for atomic updates
and I am able to do atomic updates on normal documents but with nested
child documents, I am getting a Null Pointer Exception. Following is the
simple test which I am trying.

TestPojo  pojo1  = new TestPojo().cId( "abcd" )
>  .conceptid( "c1" )
>  .storeid( storeId )
>  .testChildPojos(
> Collections.list( testChildPOJO, testChildPOJO2,
>  testChildPOJO3 )
> );
> TestChildPOJOtestChildPOJO = new TestChildPOJO().cId(
> "c1_child1" )
>   .conceptid( "c1" )
>   .storeid( storeId )
>   .fieldName(
> "c1_child1_field_value1" )
>   .startTime(
> Date.from( now.minus( 10, ChronoUnit.DAYS ) ) )
>   .integerField_iDF(
> 10 )
>
> .booleanField_bDF(true);
> // index pojo1 with child testChildPOJO
> SolrInputDocument sdoc = new SolrInputDocument();
> sdoc.addField( "_route_", pojo1.cId() );
> sdoc.addField( "id", testChildPOJO.cId() );
> sdoc.addField( "conceptid", testChildPOJO.conceptid() );
> sdoc.addField( "storeid", testChildPOJO.cId() );
> sdoc.setField( "fieldName", java.util.Collections.singletonMap("set",
> Collections.list(testChildPOJO.fieldName() + postfix) ) ); // modify field
> "fieldName"
> collection.client.add( sdoc );   // results in NPE!


Stack Trace:

ERROR org.apache.solr.client.solrj.impl.BaseCloudSolrClient - Request to
> collection [collectionTest2] failed due to (500)
> org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error
> from server at
> http://172.15.1.100:8081/solr/collectionTest2_shard1_replica_n1:
> java.lang.NullPointerException
> at
> org.apache.solr.update.processor.AtomicUpdateDocumentMerger.getFieldFromHierarchy(AtomicUpdateDocumentMerger.java:308)
> at
> org.apache.solr.update.processor.AtomicUpdateDocumentMerger.mergeChildDoc(AtomicUpdateDocumentMerger.java:405)
> at
> org.apache.solr.update.processor.DistributedUpdateProcessor.getUpdatedDocument(DistributedUpdateProcessor.java:711)
> at
> org.apache.solr.update.processor.DistributedUpdateProcessor.doVersionAdd(DistributedUpdateProcessor.java:374)
> at
> org.apache.solr.update.processor.DistributedUpdateProcessor.lambda$versionAdd$0(DistributedUpdateProcessor.java:339)
> at org.apache.solr.update.VersionBucket.runWithLock(VersionBucket.java:50)
> at
> org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(DistributedUpdateProcessor.java:339)
> at
> org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:225)
> at
> org.apache.solr.update.processor.DistributedZkUpdateProcessor.processAdd(DistributedZkUpdateProcessor.java:245)
> at
> org.apache.solr.update.processor.LogUpdateProcessorFactory$LogUpdateProcessor.processAdd(LogUpdateProcessorFactory.java:103)
> at
> org.apache.solr.handler.loader.JavabinLoader$1.update(JavabinLoader.java:110)
> at
> org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$StreamingCodec.readOuterMostDocIterator(JavaBinUpdateRequestCodec.java:332)
> at
> org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$StreamingCodec.readIterator(JavaBinUpdateRequestCodec.java:281)
> at
> org.apache.solr.common.util.JavaBinCodec.readObject(JavaBinCodec.java:338)
> at org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:283)
> at
> org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$StreamingCodec.readNamedList(JavaBinUpdateRequestCodec.java:236)
> at
> org.apache.solr.common.util.JavaBinCodec.readObject(JavaBinCodec.java:303)
> at org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:283)
> at
> org.apache.solr.common.util.JavaBinCodec.unmarshal(JavaBinCodec.java:196)
> at
> org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec.unmarshal(JavaBinUpdateRequestCodec.java:127)
> at
> org.apache.solr.handler.loader.JavabinLoader.parseAndLoadDocs(JavabinLoader.java:122)
> at org.apache.solr.handler.loader.JavabinLoader.load(JavabinLoader.java:70)
> at
> org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler
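
For contrast, a hedged sketch of the atomic-update shape the author reports working on normal (non-nested) documents; the collection and field names reuse the example above, and the client setup is illustrative:

import java.util.Collections;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.common.SolrInputDocument;

public class AtomicUpdateExample {
    public static void main(String[] args) throws Exception {
        // Illustrative core URL; adjust to your deployment.
        try (HttpSolrClient client = new HttpSolrClient.Builder(
                "http://localhost:8983/solr/collectionTest2").build()) {
            SolrInputDocument doc = new SolrInputDocument();
            doc.addField("id", "abcd");  // id of an existing top-level document
            // "set" replaces the stored value; other atomic ops include add, remove, inc
            doc.addField("fieldName", Collections.singletonMap("set", "newValue"));
            client.add(doc);
            client.commit();
        }
    }
}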

SolrJ and AWS help!

2020-09-15 Thread Dhara Patel
Hi,

I’m a student working on a personal project that uses SolrJ to search a 
database using a solr core that lives on an AWS instance. Currently, I am using 
EmbeddedSolrServer() to initialize a Solr core.

CoreContainer.Initializer initializer = new CoreContainer.Initializer();
CoreContainer coreContainer = initializer.initialize();
solr = new EmbeddedSolrServer(coreContainer, "test-vpn");

solr_active = true; //successfully connected to solr core on aws

I would love your input on whether or not this is the correct method for this 
particular implementation.

Thanks!
Dhara


Sent from Mail for Windows 10
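
A hedged sketch of the more usual approach for this setup: EmbeddedSolrServer runs a core inside your own JVM rather than connecting to a remote one, so a core living on an AWS instance would normally be queried through an HttpSolrClient pointed at its URL (the host and core name below are made up):

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.response.QueryResponse;

public class RemoteSolrExample {
    public static void main(String[] args) throws Exception {
        // Hypothetical AWS host; port 8983 is Solr's default.
        try (HttpSolrClient solr = new HttpSolrClient.Builder(
                "http://ec2-host.example.com:8983/solr/test-vpn").build()) {
            QueryResponse rsp = solr.query(new SolrQuery("*:*"));
            System.out.println("Found " + rsp.getResults().getNumFound() + " documents");
        }
    }
}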



Re: Solr 8.6.1: Can't round-trip nested document from SolrJ

2020-08-24 Thread Alexandre Rafalovitch
I guess this gets into the point of whether "children" or whatever
field is used for child documents actually needs to be in the schema.
Schemaless mode creates one, but that's not a defining factor. Because
if it needs to be in the schema, then the code should reflect its
cardinality. But if it does not, then all bets are off.

Regards,
   Alex.
P.s. I added this question to SOLR-12298, as I don't think I know
enough about this part of Solr to judge.

On Mon, 24 Aug 2020 at 02:28, Munendra S N  wrote:
>
> >
> > Interestingly, I was forced to add children as an array even when the
> > child was alone and the field was already marked multivalued. It seems
> > > the code does not do conversion to multi-value type, which means the
> > query code has to be a lot more careful about checking field return
> > type and having multi-path handling. That's not what Solr does for
> > string class (tested). Is that a known issue?
> >
> > https://github.com/arafalov/SolrJTest/blob/master/src/com/solrstart/solrj/Main.java#L88-L89
>
> Not sure about this. Maybe we might need to check in Dev list or Slack
>
>  If I switch commented/uncommented lines around, the retrieval will fail
> > part way through, because one 'children' field is returned as array, but
> > not the other one:
>
> This might be because of these checks
> https://github.com/apache/lucene-solr/blob/e1392c74400d74366982ccb796063ffdcef08047/solr/core/src/java/org/apache/solr/response/transform/ChildDocTransformer.java#L201-L209
> but
> not sure
>
> Regards,
> Munendra S N
>
>
>
> On Sun, Aug 23, 2020 at 7:53 PM Alexandre Rafalovitch 
> wrote:
>
> > Thank you Munendra,
> >
> > That was very helpful. I am looking forward to that documentation Jira
> > to be merged into the next release.
> >
> > I was able to get the example working by switching away from anonymous
> > children to the field approach, which means the hasChildren() call also
> > did not work. It seems the addChildren/hasChildren will need a
> > different schema, without _nest_path_ defined. I did not test.
> >
> > Interestingly, I was forced to add children as an array even when the
> > child was alone and the field was already marked multivalued. It seems
> > the code does not do conversion to multi-value type, which means the
> > query code has to be a lot more careful about checking field return
> > type and having multi-path handling. That's not what Solr does for
> > string class (tested). Is that a known issue?
> >
> > https://github.com/arafalov/SolrJTest/blob/master/src/com/solrstart/solrj/Main.java#L88-L89
> >
> > If I switch commented/uncommented lines around, the retrieval will
> > fail part way through, because one 'children' field is returned as
> > array, but not the other one:
> >
> > {responseHeader={status=0,QTime=0,params={q=id:p1,fl=*,[child],wt=javabin,version=2}},response={numFound=1,numFoundExact=true,start=0,docs=[SolrDocument{id=p1,
> > name=[parent1], class=[foo.bar.parent1.1, foo.bar.parent1.2],
> > _version_=1675826293154775040, children=[SolrDocument{id=c1,
> > name=[child1], class=[foo.bar.child1], _version_=1675826293154775040,
> > children=SolrDocument{id=gc1, name=[grandChild1],
> > class=[foo.bar.grandchild1], _version_=1675826293154775040}},
> > SolrDocument{id=c2, name=[child2], class=[foo.bar.child2],
> > _version_=1675826293154775040}]}]}}
> >
> > Regards,
> >Alex.
> >
> > On Sun, 23 Aug 2020 at 01:38, Munendra S N 
> > wrote:
> > >
> > > Hi Alex,
> > >
> > > Currently, Fixing the documentation for nested docs is under progress.
> > More
> > > context is available in this JIRA -
> > > https://issues.apache.org/jira/browse/SOLR-14383.
> > >
> > >
> > https://github.com/arafalov/SolrJTest/blob/master/src/com/solrstart/solrj/Main.java
> > >
> > > The child doc transformer needs to be specified as part of the fl
> > parameter
> > > like fl=*,[child] so that the descendants are returned for each matching
> > > doc. As the query q=* matches all the documents, they are returned. If
> > only
> > > parent doc needs to be returned with descendants then, we should either
> > use
> > > block join query or query clause which matches only parent doc.
> > >
> > > Another thing I noticed in the code is that the child docs are indexed as
> > > anonymous docs (similar to old syntax) instead of indexing them in the
> > new
> > > syntax. With this, the nested block will be indexed but since the schema
> > > has _nested_path

Re: Solr 8.6.1: Can't round-trip nested document from SolrJ

2020-08-24 Thread Munendra S N
>
> Interestingly, I was forced to add children as an array even when the
> child was alone and the field was already marked multivalued. It seems
> the code does not do conversion to multi-value type, which means the
> query code has to be a lot more careful about checking field return
> type and having multi-path handling. That's not what Solr does for
> string class (tested). Is that a known issue?
>
> https://github.com/arafalov/SolrJTest/blob/master/src/com/solrstart/solrj/Main.java#L88-L89

Not sure about this. Maybe we might need to check in Dev list or Slack

 If I switch commented/uncommented lines around, the retrieval will fail
> part way through, because one 'children' field is returned as array, but
> not the other one:

This might be because of these checks
https://github.com/apache/lucene-solr/blob/e1392c74400d74366982ccb796063ffdcef08047/solr/core/src/java/org/apache/solr/response/transform/ChildDocTransformer.java#L201-L209
but
not sure

Regards,
Munendra S N



On Sun, Aug 23, 2020 at 7:53 PM Alexandre Rafalovitch 
wrote:

> Thank you Munendra,
>
> That was very helpful. I am looking forward to that documentation Jira
> to be merged into the next release.
>
> I was able to get the example working by switching away from anonymous
> children to the field approach, which means the hasChildren() call also
> did not work. It seems the addChildren/hasChildren will need a
> different schema, without _nest_path_ defined. I did not test.
>
> Interestingly, I was forced to add children as an array even when the
> child was alone and the field was already marked multivalued. It seems
> the code does not do conversion to multi-value type, which means the
> query code has to be a lot more careful about checking field return
> type and having multi-path handling. That's not what Solr does for
> string class (tested). Is that a known issue?
>
> https://github.com/arafalov/SolrJTest/blob/master/src/com/solrstart/solrj/Main.java#L88-L89
>
> If I switch commented/uncommented lines around, the retrieval will
> fail part way through, because one 'children' field is returned as
> array, but not the other one:
>
> {responseHeader={status=0,QTime=0,params={q=id:p1,fl=*,[child],wt=javabin,version=2}},response={numFound=1,numFoundExact=true,start=0,docs=[SolrDocument{id=p1,
> name=[parent1], class=[foo.bar.parent1.1, foo.bar.parent1.2],
> _version_=1675826293154775040, children=[SolrDocument{id=c1,
> name=[child1], class=[foo.bar.child1], _version_=1675826293154775040,
> children=SolrDocument{id=gc1, name=[grandChild1],
> class=[foo.bar.grandchild1], _version_=1675826293154775040}},
> SolrDocument{id=c2, name=[child2], class=[foo.bar.child2],
> _version_=1675826293154775040}]}]}}
>
> Regards,
>Alex.
>
> On Sun, 23 Aug 2020 at 01:38, Munendra S N 
> wrote:
> >
> > Hi Alex,
> >
> > Currently, Fixing the documentation for nested docs is under progress.
> More
> > context is available in this JIRA -
> > https://issues.apache.org/jira/browse/SOLR-14383.
> >
> >
> https://github.com/arafalov/SolrJTest/blob/master/src/com/solrstart/solrj/Main.java
> >
> > The child doc transformer needs to be specified as part of the fl
> parameter
> > like fl=*,[child] so that the descendants are returned for each matching
> > doc. As the query q=* matches all the documents, they are returned. If
> only
> > parent doc needs to be returned with descendants then, we should either
> use
> > block join query or query clause which matches only parent doc.
> >
> > Another thing I noticed in the code is that the child docs are indexed as
> > anonymous docs (similar to old syntax) instead of indexing them in the
> new
> > syntax. With this, the nested block will be indexed but since the schema
> > has _nested_path_ defined [child] doc transformer won't return any docs.
> > Anonymous child docs need parentFilter but specifying parentFilter with
> > _nested_path_ will lead to error
> > It is due to this check -
> >
> https://github.com/apache/lucene-solr/blob/1c8f4c988a07b08f83d85e27e59b43eed5e2ca2a/solr/core/src/java/org/apache/solr/response/transform/ChildDocTransformerFactory.java#L104
> >
> > Instead of indexing the docs this way,
> >
> > > SolrInputDocument parent1 = new SolrInputDocument();
> > > parent1.addField("id", "p1");
> > > parent1.addField("name", "parent1");
> > > parent1.addField("class", "foo.bar.parent1");
> > >
> > > SolrInputDocument child1 = new SolrInputDocument();
> > >
> > > parent1.addChildDocument(child1);
> > > child1.addField("id", "c1");

Re: Solr 8.6.1: Can't round-trip nested document from SolrJ

2020-08-23 Thread Alexandre Rafalovitch
Thank you Munendra,

That was very helpful. I am looking forward to that documentation Jira
to be merged into the next release.

I was able to get the example working by switching away from anonymous
children to the field approach, which means the hasChildren() call also
did not work. It seems the addChildren/hasChildren will need a
different schema, without _nest_path_ defined. I did not test.

Interestingly, I was forced to add children as an array even when the
child was alone and the field was already marked multivalued. It seems
the code does not do conversion to multi-value type, which means the
query code has to be a lot more careful about checking field return
type and having multi-path handling. That's not what Solr does for
string class (tested). Is that a known issue?
https://github.com/arafalov/SolrJTest/blob/master/src/com/solrstart/solrj/Main.java#L88-L89

If I switch commented/uncommented lines around, the retrieval will
fail part way through, because one 'children' field is returned as
array, but not the other one:
{responseHeader={status=0,QTime=0,params={q=id:p1,fl=*,[child],wt=javabin,version=2}},response={numFound=1,numFoundExact=true,start=0,docs=[SolrDocument{id=p1,
name=[parent1], class=[foo.bar.parent1.1, foo.bar.parent1.2],
_version_=1675826293154775040, children=[SolrDocument{id=c1,
name=[child1], class=[foo.bar.child1], _version_=1675826293154775040,
children=SolrDocument{id=gc1, name=[grandChild1],
class=[foo.bar.grandchild1], _version_=1675826293154775040}},
SolrDocument{id=c2, name=[child2], class=[foo.bar.child2],
_version_=1675826293154775040}]}]}}

Regards,
   Alex.

On Sun, 23 Aug 2020 at 01:38, Munendra S N  wrote:
>
> Hi Alex,
>
> Currently, Fixing the documentation for nested docs is under progress. More
> context is available in this JIRA -
> https://issues.apache.org/jira/browse/SOLR-14383.
>
> https://github.com/arafalov/SolrJTest/blob/master/src/com/solrstart/solrj/Main.java
>
> The child doc transformer needs to be specified as part of the fl parameter
> like fl=*,[child] so that the descendants are returned for each matching
> doc. As the query q=* matches all the documents, they are returned. If only
> parent doc needs to be returned with descendants then, we should either use
> block join query or query clause which matches only parent doc.
>
> Another thing I noticed in the code is that the child docs are indexed as
> anonymous docs (similar to old syntax) instead of indexing them in the new
> syntax. With this, the nested block will be indexed but since the schema
> has _nested_path_ defined [child] doc transformer won't return any docs.
> Anonymous child docs need parentFilter but specifying parentFilter with
> _nested_path_ will lead to error
> It is due to this check -
> https://github.com/apache/lucene-solr/blob/1c8f4c988a07b08f83d85e27e59b43eed5e2ca2a/solr/core/src/java/org/apache/solr/response/transform/ChildDocTransformerFactory.java#L104
>
> Instead of indexing the docs this way,
>
> > SolrInputDocument parent1 = new SolrInputDocument();
> > parent1.addField("id", "p1");
> > parent1.addField("name", "parent1");
> > parent1.addField("class", "foo.bar.parent1");
> >
> > SolrInputDocument child1 = new SolrInputDocument();
> >
> > parent1.addChildDocument(child1);
> > child1.addField("id", "c1");
> > child1.addField("name", "child1");
> > child1.addField("class", "foo.bar.child1");
> >
> >
> modify it to indexing
>
> > SolrInputDocument parent1 = new SolrInputDocument();
> > parent1.addField("id", "p1");
> > parent1.addField("name", "parent1");
> > parent1.addField("class", "foo.bar.parent1");
> >
> > SolrInputDocument child1 = new SolrInputDocument();
> >
> > parent1.addField("sometag", Arrays.asList(child1));
> > child1.addField("id", "c1");
> > child1.addField("name", "child1");
> > child1.addField("class", "foo.bar.child1");
> >
> I think, once the documentation fixes get merged to master, indexing and
> searching with the nested documents will become much clearer.
>
> Regards,
> Munendra S N
>
>
>
> On Sun, Aug 23, 2020 at 5:18 AM Alexandre Rafalovitch 
> wrote:
>
> > Hello,
> >
> > I am trying to get up to date with both SolrJ and Nested Document
> > implementation and not sure where I am failing with a basic test
> > (
> > https://github.com/arafalov/SolrJTest/blob/master/src/com/solrstart/solrj/Main.java
> > ).
> >
> > I am using Solr 8.6.1 with a core created with bin/solr create -c

Re: Solr 8.6.1: Can't round-trip nested document from SolrJ

2020-08-22 Thread Munendra S N
Hi Alex,

Currently, Fixing the documentation for nested docs is under progress. More
context is available in this JIRA -
https://issues.apache.org/jira/browse/SOLR-14383.

https://github.com/arafalov/SolrJTest/blob/master/src/com/solrstart/solrj/Main.java

The child doc transformer needs to be specified as part of the fl parameter
like fl=*,[child] so that the descendants are returned for each matching
doc. As the query q=* matches all the documents, they are returned. If only
parent doc needs to be returned with descendants then, we should either use
block join query or query clause which matches only parent doc.

Another thing I noticed in the code is that the child docs are indexed as
anonymous docs (similar to old syntax) instead of indexing them in the new
syntax. With this, the nested block will be indexed but since the schema
has _nested_path_ defined [child] doc transformer won't return any docs.
Anonymous child docs need parentFilter but specifying parentFilter with
_nested_path_ will lead to error
It is due to this check -
https://github.com/apache/lucene-solr/blob/1c8f4c988a07b08f83d85e27e59b43eed5e2ca2a/solr/core/src/java/org/apache/solr/response/transform/ChildDocTransformerFactory.java#L104

Instead of indexing the docs this way,

> SolrInputDocument parent1 = new SolrInputDocument();
> parent1.addField("id", "p1");
> parent1.addField("name", "parent1");
> parent1.addField("class", "foo.bar.parent1");
>
> SolrInputDocument child1 = new SolrInputDocument();
>
> parent1.addChildDocument(child1);
> child1.addField("id", "c1");
> child1.addField("name", "child1");
> child1.addField("class", "foo.bar.child1");
>
>
modify it to indexing

> SolrInputDocument parent1 = new SolrInputDocument();
> parent1.addField("id", "p1");
> parent1.addField("name", "parent1");
> parent1.addField("class", "foo.bar.parent1");
>
> SolrInputDocument child1 = new SolrInputDocument();
>
> parent1.addField("sometag", Arrays.asList(child1));
> child1.addField("id", "c1");
> child1.addField("name", "child1");
> child1.addField("class", "foo.bar.child1");
>
I think, once the documentation fixes get merged to master, indexing and
searching with the nested documents will become much clearer.
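
A hedged SolrJ sketch of the query side described above, requesting descendants via the [child] transformer in fl (the core URL and field names are illustrative, following the thread's example):

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.common.SolrDocument;

public class ChildQueryExample {
    public static void main(String[] args) throws Exception {
        try (HttpSolrClient client = new HttpSolrClient.Builder(
                "http://localhost:8983/solr/solrj").build()) {
            SolrQuery query = new SolrQuery("id:p1");   // match only the parent doc
            query.setFields("*", "[child]");            // fl=*,[child]
            QueryResponse rsp = client.query(query);
            for (SolrDocument parent : rsp.getResults()) {
                // With a _nest_path_-aware schema, descendants come back nested
                // under the field they were indexed into (e.g. "children").
                System.out.println(parent.get("id") + " -> " + parent.getFieldValue("children"));
            }
        }
    }
}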

Regards,
Munendra S N



On Sun, Aug 23, 2020 at 5:18 AM Alexandre Rafalovitch 
wrote:

> Hello,
>
> I am trying to get up to date with both SolrJ and Nested Document
> implementation and not sure where I am failing with a basic test
> (
> https://github.com/arafalov/SolrJTest/blob/master/src/com/solrstart/solrj/Main.java
> ).
>
> I am using Solr 8.6.1 with a core created with bin/solr create -c
> solrj (schemaless is still on).
>
> I then index a nested parent/child/grandchild document in and then
> query it back. Looking at debug it seems to go out fine as a nested
> doc but come back as a 3 individual ones.
>
> Output is:
> SolrInputDocument(fields: [id=p1, name=parent1,
> class=foo.bar.parent1], children: [SolrInputDocument(fields: [id=c1,
> name=child1, class=foo.bar.child1], children:
> [SolrInputDocument(fields: [id=gc1, name=grandChild1,
> class=foo.bar.grandchild1])])])
>
> {responseHeader={status=0,QTime=1,params={q=*,wt=javabin,version=2}},response={numFound=3,numFoundExact=true,start=0,docs=[SolrDocument{id=gc1,
> name=[grandChild1], class=[foo.bar.grandchild1],
> _version_=1675769219435724800}, SolrDocument{id=c1, name=[child1],
> class=[foo.bar.child1], _version_=1675769219435724800},
> SolrDocument{id=p1, name=[parent1], class=[foo.bar.parent1],
> _version_=1675769219435724800}]}}
> Found 3 documents
>
> Field: 'id' => 'gc1'
> Field: 'name' => '[grandChild1]'
> Field: 'class' => '[foo.bar.grandchild1]'
> Field: '_version_' => '1675769219435724800'
> Children: false
>
> Field: 'id' => 'c1'
> Field: 'name' => '[child1]'
> Field: 'class' => '[foo.bar.child1]'
> Field: '_version_' => '1675769219435724800'
> Children: false
>
> Field: 'id' => 'p1'
> Field: 'name' => '[parent1]'
> Field: 'class' => '[foo.bar.parent1]'
> Field: '_version_' => '1675769219435724800'
> Children: false
>
> Looking in Admin UI:
> * _root_ element is there and has 3 instances of 'p1' value
> * _nest_path_ (of type _nest_path_ !?!) is also there but is not populated
> * _nest_parent_ is not there
>
> I am not quite sure what that means and what other schema modification
> (to the _default_) I need to do to get it to work.
>
> I also tried to reproduce the example in the documentation (e.g.
> https://lucene.apache.org/solr/guide/8_6/indexing-nested-documents.html
> and
> https://lucene.apache.org/solr/guide/8_6/searching-nested-documents.html#searching-nested-documents
> )
> but both seem to also want some undiscussed schema (e.g. with ID field
> instead of id) and fail to execute against default schema.
>
> I am kind of stuck. Anybody has a working SolrJ/Nested example or
> ideas of what I missed.
>
> Regards,
>Alex.
>


Solr 8.6.1: Can't round-trip nested document from SolrJ

2020-08-22 Thread Alexandre Rafalovitch
Hello,

I am trying to get up to date with both SolrJ and Nested Document
implementation and not sure where I am failing with a basic test
(https://github.com/arafalov/SolrJTest/blob/master/src/com/solrstart/solrj/Main.java).

I am using Solr 8.6.1 with a core created with bin/solr create -c
solrj (schemaless is still on).

I then index a nested parent/child/grandchild document in and then
query it back. Looking at debug it seems to go out fine as a nested
doc but come back as a 3 individual ones.

Output is:
SolrInputDocument(fields: [id=p1, name=parent1,
class=foo.bar.parent1], children: [SolrInputDocument(fields: [id=c1,
name=child1, class=foo.bar.child1], children:
[SolrInputDocument(fields: [id=gc1, name=grandChild1,
class=foo.bar.grandchild1])])])
{responseHeader={status=0,QTime=1,params={q=*,wt=javabin,version=2}},response={numFound=3,numFoundExact=true,start=0,docs=[SolrDocument{id=gc1,
name=[grandChild1], class=[foo.bar.grandchild1],
_version_=1675769219435724800}, SolrDocument{id=c1, name=[child1],
class=[foo.bar.child1], _version_=1675769219435724800},
SolrDocument{id=p1, name=[parent1], class=[foo.bar.parent1],
_version_=1675769219435724800}]}}
Found 3 documents

Field: 'id' => 'gc1'
Field: 'name' => '[grandChild1]'
Field: 'class' => '[foo.bar.grandchild1]'
Field: '_version_' => '1675769219435724800'
Children: false

Field: 'id' => 'c1'
Field: 'name' => '[child1]'
Field: 'class' => '[foo.bar.child1]'
Field: '_version_' => '1675769219435724800'
Children: false

Field: 'id' => 'p1'
Field: 'name' => '[parent1]'
Field: 'class' => '[foo.bar.parent1]'
Field: '_version_' => '1675769219435724800'
Children: false

Looking in Admin UI:
* _root_ element is there and has 3 instances of 'p1' value
* _nest_path_ (of type _nest_path_ !?!) is also there but is not populated
* _nest_parent_ is not there

I am not quite sure what that means and what other schema modification
(to the _default_) I need to do to get it to work.

I also tried to reproduce the example in the documentation (e.g.
https://lucene.apache.org/solr/guide/8_6/indexing-nested-documents.html
and  
https://lucene.apache.org/solr/guide/8_6/searching-nested-documents.html#searching-nested-documents)
but both seem to also want some undiscussed schema (e.g. with ID field
instead of id) and fail to execute against default schema.

I am kind of stuck. Anybody has a working SolrJ/Nested example or
ideas of what I missed.

Regards,
   Alex.


Re: Solrj client 8.6.0 issue special characters in query

2020-08-07 Thread Chris Hostetter

: Hmm, setting -Dfile.encoding=UTF-8 solves the problem. I have to now check
: which component of the application screws it up, but at the moment I do NOT
: believe it is related to Solrj.

You can use the "forbidden-apis" project to analyze your code and look for 
uses of APIs that depend on the default file encoding, locale, charset, 
etc...

https://github.com/policeman-tools/forbidden-apis

...this project started as an offshoot of build rules in 
Lucene/Solr, precisely to help detect problems like the one you 
are facing -- and it's used to analyze all Solr code, which is why I'm 
pretty confident that no SolrJ code is mistakenly 
parsing/converting/encoding your input -- although in theory it could be 
a 3rd party library Solr uses.  (Hardcoding the unicode string in your 
java application and passing it as a solr param should help prove/disprove 
that)

: 
: On Fri, Aug 7, 2020 at 11:53 AM Jörn Franke  wrote:
: 
: > Dear all,
: >
: > I have the following issue. I have a Solrj Client 8.6 (but it happens
: > also in previous versions), where I execute, for example, the following
: > query:
: > Jörn
: >
: > If I look into Solr Admin UI it finds all the right results.
: >
: > If I use Solrj client then it does not find anything.
: > Further, investigating in debug mode it seems that the URI to server gets
: > wrongly encoded.
: > Jörn becomes J%C3%83%C2%B6rn
: > It should become only J%C3%B6rn
: > any idea why this happens and why it adds %83%C2 in between? Those do not
: > even seem to be valid UTF-8 characters
: >
: > I verified with various statements that I give to Solrj the correctly
: > encoded String "Jörn"
: >
: > Can anyone help me here?
: >
: > Thank you.
: >
: > best regards
: >
: 

-Hoss
http://www.lucidworks.com/

Re: Solrj client 8.6.0 issue special characters in query

2020-08-07 Thread Andy Webb
hi Jörn - something's decoding a UTF8 sequence using the legacy iso-8859-1
character set:

Jörn is J%C3%B6rn in UTF8
J%C3%B6rn misinterpreted as iso-8859-1 is JÃ¶rn
JÃ¶rn is J%C3%83%C2%B6rn in UTF8
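
A small self-contained Java sketch of that double-encoding chain (illustrative only):

import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class MojibakeDemo {
    public static void main(String[] args) throws Exception {
        String original = "Jörn";
        // Correct UTF-8 percent-encoding: J%C3%B6rn
        System.out.println(URLEncoder.encode(original, "UTF-8"));
        // Decode the UTF-8 bytes as ISO-8859-1 -> mojibake "JÃ¶rn"
        String mojibake = new String(original.getBytes(StandardCharsets.UTF_8),
                StandardCharsets.ISO_8859_1);
        System.out.println(mojibake);
        // Re-encode the mojibake as UTF-8 -> the observed J%C3%83%C2%B6rn
        System.out.println(URLEncoder.encode(mojibake, "UTF-8"));
    }
}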

I hope this helps track down the problem!
Andy

On Fri, 7 Aug 2020 at 12:08, Jörn Franke  wrote:

> Hmm, setting -Dfile.encoding=UTF-8 solves the problem. I have to now check
> which component of the application screws it up, but at the moment I do NOT
> believe it is related to Solrj.
>
> On Fri, Aug 7, 2020 at 11:53 AM Jörn Franke  wrote:
>
> > Dear all,
> >
> > I have the following issue. I have a Solrj Client 8.6 (but it happens
> > also in previous versions), where I execute, for example, the following
> > query:
> > Jörn
> >
> > If I look into Solr Admin UI it finds all the right results.
> >
> > If I use Solrj client then it does not find anything.
> > Further, investigating in debug mode it seems that the URI to server gets
> > wrongly encoded.
> > Jörn becomes J%C3%83%C2%B6rn
> > It should become only J%C3%B6rn
> > any idea why this happens and why it adds %83%C2 in between? Those do not
> > even seem to be valid UTF-8 characters
> >
> > I verified with various statements that I give to Solrj the correctly
> > encoded String "Jörn"
> >
> > Can anyone help me here?
> >
> > Thank you.
> >
> > best regards
> >
>


Re: Solrj client 8.6.0 issue special characters in query

2020-08-07 Thread Jörn Franke
Hmm, setting -Dfile.encoding=UTF-8 solves the problem. I have to now check
which component of the application screws it up, but at the moment I do NOT
believe it is related to Solrj.

On Fri, Aug 7, 2020 at 11:53 AM Jörn Franke  wrote:

> Dear all,
>
> I have the following issue. I have a Solrj Client 8.6 (but it happens
> also in previous versions), where I execute, for example, the following
> query:
> Jörn
>
> If I look into Solr Admin UI it finds all the right results.
>
> If I use Solrj client then it does not find anything.
> Further, investigating in debug mode it seems that the URI to server gets
> wrongly encoded.
> Jörn becomes J%C3%83%C2%B6rn
> It should become only J%C3%B6rn
> any idea why this happens and why it adds %83%C2 in between? Those do not
> even seem to be valid UTF-8 characters
>
> I verified with various statements that I give to Solrj the correctly
> encoded String "Jörn"
>
> Can anyone help me here?
>
> Thank you.
>
> best regards
>


Solrj client 8.6.0 issue special characters in query

2020-08-07 Thread Jörn Franke
Dear all,

I have the following issue. I have a Solrj Client 8.6 (but it happens also
in previous versions), where I execute, for example, the following query:
Jörn

If I look into Solr Admin UI it finds all the right results.

If I use Solrj client then it does not find anything.
Further, investigating in debug mode it seems that the URI to server gets
wrongly encoded.
Jörn becomes J%C3%83%C2%B6rn
It should become only J%C3%B6rn
any idea why this happens and why it adds %83%C2 in between? Those do not
even seem to be valid UTF-8 characters

I verified with various statements that I give to Solrj the correctly encoded
String "Jörn"

Can anyone help me here?

Thank you.

best regards


Re: solrj - get metrics from all nodes

2020-07-02 Thread ChienHuaWang
Thanks for Jan's response.

I tried to set this "nodes" parameter via ModifiableSolrParams, but null
is returned from GenericSolrRequest.
Could anyone advise the best approach to set up this parameter for multiple
nodes?


Thanks,
Chien





Re: solrj - get metrics from all nodes

2020-06-30 Thread Jan Høydahl
Use nodes=, not node=

> On 30 Jun 2020, at 02:02, ChienHuaWang wrote:
> 
> Hi Jan,
> 
> Thanks for the response.
> Could you please share more detail on how you request the metrics from multiple
> nodes at the same time?
> I do something as below, but only get one node's info; the data I'm most
> interested in is, e.g., CONTAINER.fs.totalSpace, CONTAINER.fs.usableSpace, etc.
> 
> 
> solr/admin/metrics?group=node&node=node1_name,node2_name
> 
> 
> 
> 



Re: solrj - get metrics from all nodes

2020-06-30 Thread ChienHuaWang
Hi Jan,

Thanks for the response.
Could you please share more detail on how you request the metrics from multiple
nodes at the same time?
I do something as below, but only get one node's info; the data I'm most
interested in is, e.g., CONTAINER.fs.totalSpace, CONTAINER.fs.usableSpace, etc.


solr/admin/metrics?group=node&node=node1_name,node2_name






Re: solrj - get metrics from all nodes

2020-06-29 Thread Jan Høydahl
The admin UI does this by requesting nodes=node1,node2,…
You will get a master response with each sub-response as key:value pairs.
The list of node names can be found in live_nodes in the CLUSTERSTATUS API.

Jan

> On 27 Jun 2020, at 02:09, ChienHuaWang wrote:
> 
> For people who are also looking for the solution - you can append
> "node=node_name" to the metrics request to get a specific node's data.
> If anyone knows how to get the data of all the nodes together, please kindly
> share, thanks.
> 
> 
> Regards,
> Chien
> 
> 
> 
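
Since setting the nodes parameter reportedly returned null further up the thread, here is a hedged SolrJ sketch of the other approach Jan describes: read live_nodes from the cluster state and hit each node's /admin/metrics directly. The node-name-to-URL conversion assumes the default http scheme and standard node names of the form host:8983_solr.

import org.apache.solr.client.solrj.SolrRequest;
import org.apache.solr.client.solrj.impl.CloudSolrClient;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.request.GenericSolrRequest;
import org.apache.solr.common.params.ModifiableSolrParams;
import org.apache.solr.common.util.NamedList;

public class AllNodesMetrics {
    public static void main(String[] args) throws Exception {
        try (CloudSolrClient cloud = new CloudSolrClient.Builder(
                java.util.Collections.singletonList("localhost:9983"),
                java.util.Optional.empty()).build()) {
            cloud.connect();
            ModifiableSolrParams params = new ModifiableSolrParams();
            params.set("group", "node");
            for (String node : cloud.getZkStateReader().getClusterState().getLiveNodes()) {
                String baseUrl = "http://" + node.replace("_solr", "/solr");
                try (HttpSolrClient perNode = new HttpSolrClient.Builder(baseUrl).build()) {
                    NamedList<Object> rsp = perNode.request(new GenericSolrRequest(
                            SolrRequest.METHOD.GET, "/admin/metrics", params));
                    System.out.println(node + " -> " + rsp.get("metrics"));
                }
            }
        }
    }
}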



Re: solrj - get metrics from all nodes

2020-06-27 Thread ChienHuaWang
For people who are also looking for the solution - you can append
"node=node_name" to the metrics request to get a specific node's data.
If anyone knows how to get the data of all the nodes together, please kindly
share, thanks.


Regards,
Chien





Re: solrj - get metrics from all nodes

2020-06-25 Thread ChienHuaWang
I observed exactly the same thing - the metrics for only one node. 
I'm looking for a solution to get the metrics of all the nodes.
Could anyone advise?

Thanks,
Chien





Re: Getting interestingTerms from solr.MoreLikeThisHandler using SolrJ

2020-06-19 Thread Shawn Heisey

On 6/18/2020 5:31 AM, Zander, Sebastian wrote:

In the returning QueryResponse I can't find the interestingTerms.
I would really like to grab it this way, without making another call.
Any advice? I'm running Solr 8.5.2


If you can send the full json or XML response, I think I can show you 
how to parse it with SolrJ.  I don't have easy access to production Solr 
servers, so it's a little difficult for me to try it out myself.


Thanks,
Shawn


Getting interestingTerms from solr.MoreLikeThisHandler using SolrJ

2020-06-18 Thread Zander, Sebastian
Hello solr-fellows,

I'm currently implementing the MoreLikeThis feature in an e-commerce platform.
I set up my solr.MoreLikeThisHandler in my solrconfig.xml like this:

[solrconfig.xml handler definition; the XML element tags were stripped in the
archive. The surviving values, in order: "aid, eans, desclong", "list", "true",
"10", "aid, eans, desclong", "0".]

Running the following command in the browser:
http://localhost:8983/solr/ecom/mlt?q=aid:1
It returns 10 docs in its response and a list of interestingTerms.

Running the same via SolrJ, I can't find the interestingTerms.
I set up my SolrQuery like this:

SolrQuery sq = new SolrQuery();
sq.setRequestHandler("/mlt");
sq.setQuery("aid:" + p_aid);

In the returning QueryResponse I can't find the interestingTerms.
I would really like to grab it this way, without making another call.
Any advice? I'm running Solr 8.5.2

Thanks in advance,
Sebastian Zander
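
As far as I can tell, QueryResponse has no typed getter for interestingTerms, but the raw response NamedList should still carry whatever the /mlt handler returned. A hedged sketch (the core name follows the example above):

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.common.util.NamedList;

public class MltInterestingTerms {
    public static void main(String[] args) throws Exception {
        try (HttpSolrClient solr = new HttpSolrClient.Builder(
                "http://localhost:8983/solr/ecom").build()) {
            SolrQuery sq = new SolrQuery("aid:1");
            sq.setRequestHandler("/mlt");
            QueryResponse rsp = solr.query(sq);
            // Typed getters don't cover MLT handler extras, so read the raw response.
            NamedList<Object> raw = rsp.getResponse();
            System.out.println(raw.get("interestingTerms"));
        }
    }
}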



RE: Timeout issue while doing update operations from clients (using SolrJ)

2020-06-12 Thread Kommu, Vinodh K.
I feel the mentioned issue is more or less relevant to the following JIRA. Any idea 
on this?

https://issues.apache.org/jira/browse/SOLR-13458

Thanks & Regards,
Vinodh

From: Kommu, Vinodh K.
Sent: Wednesday, June 10, 2020 10:43 PM
To: solr-user@lucene.apache.org
Subject: RE: Timeout issue while doing update operations from clients (using 
SolrJ)

We are getting the following socket timeout exception during this error. Any idea 
on this?

ERROR (updateExecutor-3-thread-1392-processing-n:hostname:1100_solr 
x:TestCollection_shard6_replica_n10 c:TestCollection s:shard6 r:core_node13) 
[c:TestCollection s:shard6 r:core_node13 x:TestCollection_shard6_replica_n10] 
o.a.s.u.SolrCmdDistributor org.apache.solr.client.solrj.SolrServerException: 
Timeout occured while waiting response from server at: 
https://hostname:1100/solr/TestCollection_shard6_replica_n34
at 
org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:654)
at 
org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:255)
at 
org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:244)
at 
org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient.request(ConcurrentUpdateSolrClient.java:491)
at org.apache.solr.client.solrj.SolrClient.request(SolrClient.java:1260)
at 
org.apache.solr.update.SolrCmdDistributor.doRequest(SolrCmdDistributor.java:326)
at 
org.apache.solr.update.SolrCmdDistributor.lambda$submit$0(SolrCmdDistributor.java:315)
at 
org.apache.solr.update.SolrCmdDistributor.dt_access$675(SolrCmdDistributor.java)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
com.codahale.metrics.InstrumentedExecutorService$InstrumentedRunnable.run(InstrumentedExecutorService.java:176)
at 
org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:209)
at 
org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.dt_access$303(ExecutorUtil.java)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.net.SocketTimeoutException: Read timed out
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
at java.net.SocketInputStream.read(SocketInputStream.java:171)
at java.net.SocketInputStream.read(SocketInputStream.java:141)
at sun.security.ssl.InputRecord.readFully(InputRecord.java:465)
at sun.security.ssl.InputRecord.read(InputRecord.java:503)
at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:975)
at sun.security.ssl.SSLSocketImpl.readDataRecord(SSLSocketImpl.java:933)
at sun.security.ssl.AppInputStream.read(AppInputStream.java:105)
at 
org.apache.http.impl.io.SessionInputBufferImpl.streamRead(SessionInputBufferImpl.java:137)
at 
org.apache.http.impl.io.SessionInputBufferImpl.fillBuffer(SessionInputBufferImpl.java:153)
at 
org.apache.http.impl.io.SessionInputBufferImpl.readLine(SessionInputBufferImpl.java:282)
at 
org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:138)
at 
org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:56)
at 
org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:259)
at 
org.apache.http.impl.DefaultBHttpClientConnection.receiveResponseHeader(DefaultBHttpClientConnection.java:163)
at 
org.apache.http.impl.conn.CPoolProxy.receiveResponseHeader(CPoolProxy.java:165)
at 
org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:273)
at 
org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:125)
at 
org.apache.solr.util.stats.InstrumentedHttpRequestExecutor.execute(InstrumentedHttpRequestExecutor.java:120)
at 
org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:272)
at 
org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:185)
at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:89)
at 
org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:110)
at 
org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185)
at 
org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83)
at 
org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.jav

RE: Timeout issue while doing update operations from clients (using SolrJ)

2020-06-11 Thread Kommu, Vinodh K.
Hi,

Can someone shed some light on this issue please?


Regards,
Vinodh Kumar K
Middleware Cache and Search Engineering
DTCC Chennai



From: Kommu, Vinodh K.
Sent: Wednesday, June 10, 2020 10:43 PM
To: solr-user@lucene.apache.org
Subject: RE: Timeout issue while doing update operations from clients (using 
SolrJ)

We are getting the following socket timeout exception during this error. Any idea 
on this?

ERROR (updateExecutor-3-thread-1392-processing-n:hostname:1100_solr 
x:TestCollection_shard6_replica_n10 c:TestCollection s:shard6 r:core_node13) 
[c:TestCollection s:shard6 r:core_node13 x:TestCollection_shard6_replica_n10] 
o.a.s.u.SolrCmdDistributor org.apache.solr.client.solrj.SolrServerException: 
Timeout occured while waiting response from server at: 
https://hostname:1100/solr/TestCollection_shard6_replica_n34
at 
org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:654)
at 
org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:255)
at 
org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:244)
at 
org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient.request(ConcurrentUpdateSolrClient.java:491)
at org.apache.solr.client.solrj.SolrClient.request(SolrClient.java:1260)
at 
org.apache.solr.update.SolrCmdDistributor.doRequest(SolrCmdDistributor.java:326)
at 
org.apache.solr.update.SolrCmdDistributor.lambda$submit$0(SolrCmdDistributor.java:315)
at 
org.apache.solr.update.SolrCmdDistributor.dt_access$675(SolrCmdDistributor.java)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
com.codahale.metrics.InstrumentedExecutorService$InstrumentedRunnable.run(InstrumentedExecutorService.java:176)
at 
org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:209)
at 
org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.dt_access$303(ExecutorUtil.java)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.net.SocketTimeoutException: Read timed out
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
at java.net.SocketInputStream.read(SocketInputStream.java:171)
at java.net.SocketInputStream.read(SocketInputStream.java:141)
at sun.security.ssl.InputRecord.readFully(InputRecord.java:465)
at sun.security.ssl.InputRecord.read(InputRecord.java:503)
at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:975)
at sun.security.ssl.SSLSocketImpl.readDataRecord(SSLSocketImpl.java:933)
at sun.security.ssl.AppInputStream.read(AppInputStream.java:105)
at 
org.apache.http.impl.io.SessionInputBufferImpl.streamRead(SessionInputBufferImpl.java:137)
at 
org.apache.http.impl.io.SessionInputBufferImpl.fillBuffer(SessionInputBufferImpl.java:153)
at 
org.apache.http.impl.io.SessionInputBufferImpl.readLine(SessionInputBufferImpl.java:282)
at 
org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:138)
at 
org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:56)
at 
org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:259)
at 
org.apache.http.impl.DefaultBHttpClientConnection.receiveResponseHeader(DefaultBHttpClientConnection.java:163)
at 
org.apache.http.impl.conn.CPoolProxy.receiveResponseHeader(CPoolProxy.java:165)
at 
org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:273)
at 
org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:125)
at 
org.apache.solr.util.stats.InstrumentedHttpRequestExecutor.execute(InstrumentedHttpRequestExecutor.java:120)
at 
org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:272)
at 
org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:185)
at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:89)
at 
org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:110)
at 
org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185)
at 
org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83)
at 
org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:56

RE: Timeout issue while doing update operations from clients (using SolrJ)

2020-06-10 Thread Kommu, Vinodh K.
We are getting the following socket timeout exception during this error. Any idea 
on this?

ERROR (updateExecutor-3-thread-1392-processing-n:hostname:1100_solr 
x:TestCollection_shard6_replica_n10 c:TestCollection s:shard6 r:core_node13) 
[c:TestCollection s:shard6 r:core_node13 x:TestCollection_shard6_replica_n10] 
o.a.s.u.SolrCmdDistributor org.apache.solr.client.solrj.SolrServerException: 
Timeout occured while waiting response from server at: 
https://hostname:1100/solr/TestCollection_shard6_replica_n34
at 
org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:654)
at 
org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:255)
at 
org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:244)
at 
org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient.request(ConcurrentUpdateSolrClient.java:491)
at org.apache.solr.client.solrj.SolrClient.request(SolrClient.java:1260)
at 
org.apache.solr.update.SolrCmdDistributor.doRequest(SolrCmdDistributor.java:326)
at 
org.apache.solr.update.SolrCmdDistributor.lambda$submit$0(SolrCmdDistributor.java:315)
at 
org.apache.solr.update.SolrCmdDistributor.dt_access$675(SolrCmdDistributor.java)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
com.codahale.metrics.InstrumentedExecutorService$InstrumentedRunnable.run(InstrumentedExecutorService.java:176)
at 
org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:209)
at 
org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.dt_access$303(ExecutorUtil.java)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.net.SocketTimeoutException: Read timed out
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
at java.net.SocketInputStream.read(SocketInputStream.java:171)
at java.net.SocketInputStream.read(SocketInputStream.java:141)
at sun.security.ssl.InputRecord.readFully(InputRecord.java:465)
at sun.security.ssl.InputRecord.read(InputRecord.java:503)
at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:975)
at sun.security.ssl.SSLSocketImpl.readDataRecord(SSLSocketImpl.java:933)
at sun.security.ssl.AppInputStream.read(AppInputStream.java:105)
at 
org.apache.http.impl.io.SessionInputBufferImpl.streamRead(SessionInputBufferImpl.java:137)
at 
org.apache.http.impl.io.SessionInputBufferImpl.fillBuffer(SessionInputBufferImpl.java:153)
at 
org.apache.http.impl.io.SessionInputBufferImpl.readLine(SessionInputBufferImpl.java:282)
at 
org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:138)
at 
org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:56)
at 
org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:259)
at 
org.apache.http.impl.DefaultBHttpClientConnection.receiveResponseHeader(DefaultBHttpClientConnection.java:163)
at 
org.apache.http.impl.conn.CPoolProxy.receiveResponseHeader(CPoolProxy.java:165)
at 
org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:273)
at 
org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:125)
at 
org.apache.solr.util.stats.InstrumentedHttpRequestExecutor.execute(InstrumentedHttpRequestExecutor.java:120)
at 
org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:272)
at 
org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:185)
at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:89)
at 
org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:110)
at 
org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185)
at 
org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83)
at 
org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:56)
at 
org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:542)

Thanks & Regards,
Vinodh

From: Kommu, Vinodh K.
Sent: Wednesday, June 10, 2020 3:41 PM
To: solr-user@lucene.apache.org
Subject: Timeout issue while doing update operations from clients (using SolrJ)

Hi,

Need some help in fixing intermittent timeout issue please. Recently we came 
ac

Timeout issue while doing update operations from clients (using SolrJ)

2020-06-10 Thread Kommu, Vinodh K.
Hi,

Need some help in fixing an intermittent timeout issue please. Recently we came 
across this timeout issue during QA performance testing when a streaming 
expression query which runs on a larger set of data (~60-80 million) from a 
client using SolrJ was timing out exactly in 2mins. Later this issue was fixed 
after bumping up the idle timeout property default value from "60000"ms to 
"600000"ms (10mins). Now we are getting timeout exceptions again when update and 
delete operations are happening. To fix this, we have increased the following 
timeout settings in the solr.xml file across all Solr nodes.


  <int name="distribUpdateSoTimeout">${distribUpdateSoTimeout:600000}</int>

  <int name="distribUpdateConnTimeout">${distribUpdateConnTimeout:600000}</int>

  <int name="socketTimeout">${socketTimeout:600000}</int>

  <int name="connTimeout">${connTimeout:600000}</int>


However, even after increasing the above timeout properties to 10mins, we are still 
seeing timeout exceptions intermittently. Does any other setting need to be updated 
in Solr, ZooKeeper, or the client? Any suggestions?


Thanks & Regards,
Vinodh
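
On the client side, SolrJ's HttpSolrClient.Builder exposes the matching socket and connection timeouts; a minimal sketch (the URL and millisecond values are illustrative):

import org.apache.solr.client.solrj.impl.HttpSolrClient;

public class TimeoutClientExample {
    public static void main(String[] args) throws Exception {
        try (HttpSolrClient solr = new HttpSolrClient.Builder(
                "http://localhost:8983/solr/TestCollection")
                .withConnectionTimeout(60000)   // TCP connect timeout, 1 minute
                .withSocketTimeout(600000)      // read (idle) timeout, 10 minutes
                .build()) {
            System.out.println("ping status: " + solr.ping().getStatus());
        }
    }
}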



solrj - get metrics from all nodes

2020-06-03 Thread lstusr 5u93n4
Hi All,

I'm attempting to connect to the metrics api in solrj to query metrics from
my cluster. Using the CloudSolrClient, I get routed to one node, and get
metrics only from that node.

I'm building my request like this:

GenericSolrRequest req = new GenericSolrRequest(METHOD.GET,
"/admin/metrics", new MapSolrParams(params));

 NamedList<Object> resp = getCloudSolrClient().request(req);

And this returns metrics only from the node that gets selected by the
LbHttpClient (I think).

Is there an easy way to query all of the nodes for their metrics in solrj?

Kyle


Re: SolrJ connection leak with SolrCloud and Jetty Gzip compression enabled

2020-05-02 Thread Samuel Garcia Martinez
I created two different issues: one for the Content-Type issue in the server 
and another one for the reliability issue in the SolrClient for 
unexpected/malformed responses.

ContentType issue: https://issues.apache.org/jira/browse/SOLR-14456
Client issue: https://issues.apache.org/jira/browse/SOLR-14457

From: Jason Gerlowski 
Sent: Wednesday, April 22, 2020 4:43 PM
To: solr-user@lucene.apache.org 
Subject: Re: SolrJ connection leak with SolrCloud and Jetty Gzip compression 
enabled

Hi Samuel,

Thanks for the very detailed description of the problem here.  Very
thorough!  I don't think you're missing anything obvious, please file the
jira tickets if you haven't already.

Best,

Jason

On Mon, Apr 13, 2020 at 6:12 PM Samuel Garcia Martinez <
samuel...@inditex.com> wrote:

> Reading again the last two paragraphs I realized that those two
> especially are very poorly worded (grammar). I tried to rephrase them
> and correct some of the errors below.
>
> Here I can see three different problems:
>
> * HttpSolrCall should not use HttpServletResponse#setCharacterEncoding to
> set the Content-Encoding header. This is obviously a mistake.
> * HttpSolrClient, specifically the HttpClientUtil, should be modified to
> prevent that if the Content-Encoding header lies about the actual content,
> the connection is leaked forever. It should still throw the exception though.
> * HttpSolrClient should allow clients to customize HttpClient's
> connectionRequestTimeout, preventing the application from being blocked forever
> waiting for a connection to become available. This way, the application could
> respond to requests that won’t use Solr instead of rejecting any incoming
> requests because all threads are blocked forever on a connection that
> will never become available.
>
> I think the first two points are bugs that should be fixed.  The third one
> is a feature improvement to me.
>
> Unless I missed something, I'll file the two bugs and provide a patch for
> them. The same goes for the feature improvement.
>
>
> ____
> From: Samuel Garcia Martinez 
> Sent: Monday, April 13, 2020 10:08:36 PM
> To: solr-user@lucene.apache.orG 
> Subject: SolrJ connection leak with SolrCloud and Jetty Gzip compression
> enabled
>
> Hi!
>
> Today, I've seen a weird issue in production workloads when the gzip
> compression was enabled. After some minutes, the client app ran out of
> connections and stopped responding.
>
> The cluster setup is pretty simple:
> Solr version: 7.7.2
> Solr cloud enabled
> Cluster topology: 6 nodes, 1 single collection, 10 shards and 3 replicas.
> 1 HTTP LB using Round Robin over all nodes
> All cluster nodes have gzip enabled for all paths, all HTTP verbs and all
> MIME types.
> Solr client: HttpSolrClient targeting the HTTP LB
>
> Problem description: when the Solr node that receives the request has to
> forward the request to a Solr Node that actually can perform the query, the
> response headers are added incorrectly to the client response, causing the
> SolrJ client to fail and to never release the connection back to the pool.
>
> To simplify the case, let's try to start from the following repro scenario:
>
>   *   Start one node with cloud mode and port 8983
>   *   Create one single collection (1 shard, 1 replica)
>   *   Start another node with port 8984 and the previously started zk (-z
> localhost:9983)
>   *   Start a java application and query the cluster using the node on
> port 8984 (the one that doesn't host the collection)
>
> So, the steps occur like:
>
>   *   The application queries node:8984 with compression enabled
> ("Accept-Encoding: gzip") and wt=javabin
>   *   Node:8984 can't perform the query and creates a http request behind
> the scenes to node:8983
>   *   Node:8983 returns a gzipped response with "Content-Encoding: gzip"
> and "Content-Type: application/octet-stream"
>   *   Node:8984 adds the "Content-Encoding: gzip" header as the character
> encoding of the response (it should be forwarded as a "Content-Encoding"
> header, not a character encoding)
>   *   HttpSolrClient receives a "Content-Type:
> application/octet-stream;charset=gzip", causing an exception.
>   *   HttpSolr

Re: SolrJ connection leak with SolrCloud and Jetty Gzip compression enabled

2020-04-22 Thread Jason Gerlowski
Hi Samuel,

Thanks for the very detailed description of the problem here.  Very
thorough!  I don't think you're missing anything obvious, please file the
jira tickets if you haven't already.

Best,

Jason

On Mon, Apr 13, 2020 at 6:12 PM Samuel Garcia Martinez <
samuel...@inditex.com> wrote:

> Reading again the last two paragraphs I realized that those two
> especially are very poorly worded (grammar). I tried to rephrase them
> and correct some of the errors below.
>
> Here I can see three different problems:
>
> * HttpSolrCall should not use HttpServletResponse#setCharacterEncoding to
> set the Content-Encoding header. This is obviously a mistake.
> * HttpSolrClient, specifically the HttpClientUtil, should be modified to
> prevent that if the Content-Encoding header lies about the actual content,
> the connection is leaked forever. It should still throw the exception though.
> * HttpSolrClient should allow clients to customize HttpClient's
> connectionRequestTimeout, preventing the application from being blocked forever
> waiting for a connection to become available. This way, the application could
> respond to requests that won’t use Solr instead of rejecting any incoming
> requests because all threads are blocked forever on a connection that
> will never become available.
>
> I think the first two points are bugs that should be fixed.  The third one
> is a feature improvement to me.
>
> Unless I missed something, I'll file the two bugs and provide a patch for
> them. The same goes for the feature improvement.
>
>
> ____
> From: Samuel Garcia Martinez 
> Sent: Monday, April 13, 2020 10:08:36 PM
> To: solr-user@lucene.apache.orG 
> Subject: SolrJ connection leak with SolrCloud and Jetty Gzip compression
> enabled
>
> Hi!
>
> Today, I've seen a weird issue in production workloads when the gzip
> compression was enabled. After some minutes, the client app ran out of
> connections and stopped responding.
>
> The cluster setup is pretty simple:
> Solr version: 7.7.2
> Solr cloud enabled
> Cluster topology: 6 nodes, 1 single collection, 10 shards and 3 replicas.
> 1 HTTP LB using Round Robin over all nodes
> All cluster nodes have gzip enabled for all paths, all HTTP verbs and all
> MIME types.
> Solr client: HttpSolrClient targeting the HTTP LB
>
> Problem description: when the Solr node that receives the request has to
> forward the request to a Solr Node that actually can perform the query, the
> response headers are added incorrectly to the client response, causing the
> SolrJ client to fail and to never release the connection back to the pool.
>
> To simplify the case, let's try to start from the following repro scenario:
>
>   *   Start one node with cloud mode and port 8983
>   *   Create one single collection (1 shard, 1 replica)
>   *   Start another node with port 8984 and the previously started zk (-z
> localhost:9983)
>   *   Start a java application and query the cluster using the node on
> port 8984 (the one that doesn't host the collection)
>
> So, the steps occur like:
>
>   *   The application queries node:8984 with compression enabled
> ("Accept-Encoding: gzip") and wt=javabin
>   *   Node:8984 can't perform the query and creates a http request behind
> the scenes to node:8983
>   *   Node:8983 returns a gzipped response with "Content-Encoding: gzip"
> and "Content-Type: application/octet-stream"
>   *   Node:8984 adds the "Content-Encoding: gzip" header as the character
> encoding of the response (it should be forwarded as a "Content-Encoding"
> header, not a character encoding)
>   *   HttpSolrClient receives a "Content-Type:
> application/octet-stream;charset=gzip", causing an exception.
>   *   HttpSolrClient tries to quietly close the connection, but since the
> stream is broken, the Utils.consumeFully fails to actually consume the
> entity (it throws another exception in GzipDecompressingEntity#getContent()
> with "not in GZIP format")
>
> The exception thrown by HttpSolrClient is:
> java.nio.charset.UnsupportedCharsetException: gzip
>at java.nio.charset.Charset.forName(Charset.java:531)
>at
> org.apache.http.entity.ContentType.create(ContentType.java:271)
>  

Re: SolrJ connection leak with SolrCloud and Jetty Gzip compression enabled

2020-04-13 Thread Samuel Garcia Martinez
Reading the last two paragraphs again, I realized that those two especially are 
very poorly worded (grammar-wise). I tried to rephrase them and correct some of 
the errors below.

Here I can see three different problems:

* HttpSolrCall should not use HttpServletResponse#setCharacterEncoding to set 
the Content-Encoding header. This is obviously a mistake.
* HttpSolrClient, specifically the HttpClientUtil, should be modified so that 
if the Content-Encoding header lies about the actual content, the connection 
is not leaked forever. It should still throw the exception, though.
* HttpSolrClient should allow clients to customize HttpClient's 
connectionRequestTimeout, preventing the application from being blocked forever 
waiting for a connection to become available. This way, the application could 
respond to requests that won’t use Solr instead of rejecting all incoming 
requests because every thread is blocked forever waiting for a connection that 
will never become available.

I think the first two points are bugs that should be fixed. The third one is a 
feature improvement to me.

Unless I missed something, I'll file the two bugs and provide a patch for them. 
The same goes for the feature improvement.
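
For reference, this is the kind of HttpClient-level setting the third point asks 
for; a minimal sketch assuming Apache HttpClient 4.x and SolrJ 7.x (the class 
name and timeout values are made up for illustration). Note that, as described 
above, HttpSolrClient currently builds its own per-request RequestConfig, which 
is exactly why exposing connectionRequestTimeout through SolrJ itself is being 
proposed:

import org.apache.http.client.config.RequestConfig;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;
import org.apache.solr.client.solrj.impl.HttpSolrClient;

public class TimeoutConfiguredClient {
    public static HttpSolrClient build(String baseUrl) {
        // Fail fast instead of parking a thread forever while waiting
        // for a pooled connection that may never be released.
        RequestConfig config = RequestConfig.custom()
            .setConnectionRequestTimeout(5000) // ms, illustrative
            .setConnectTimeout(5000)
            .setSocketTimeout(30000)
            .build();
        CloseableHttpClient httpClient = HttpClients.custom()
            .setDefaultRequestConfig(config)
            .build();
        return new HttpSolrClient.Builder(baseUrl)
            .withHttpClient(httpClient)
            .build();
    }
}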







From: Samuel Garcia Martinez 
Sent: Monday, April 13, 2020 10:08:36 PM
To: solr-user@lucene.apache.org
Subject: SolrJ connection leak with SolrCloud and Jetty Gzip compression enabled

Hi!

Today, I've seen a weird issue in production workloads when the gzip 
compression was enabled. After some minutes, the client app ran out of 
connections and stopped responding.

The cluster setup is pretty simple:
Solr version: 7.7.2
Solr cloud enabled
Cluster topology: 6 nodes, 1 single collection, 10 shards and 3 replicas. 1 
HTTP LB using Round Robin over all nodes
All cluster nodes have gzip enabled for all paths, all HTTP verbs and all MIME 
types.
Solr client: HttpSolrClient targeting the HTTP LB

Problem description: when the Solr node that receives the request has to 
forward the request to a Solr Node that actually can perform the query, the 
response headers are added incorrectly to the client response, causing the 
SolrJ client to fail and to never release the connection back to the pool.

To simplify the case, let's try to start from the following repro scenario:

  *   Start one node with cloud mode and port 8983
  *   Create one single collection (1 shard, 1 replica)
  *   Start another node with port 8984 and the previously started zk (-z 
localhost:9983)
  *   Start a java application and query the cluster using the node on port 
8984 (the one that doesn't host the collection)

So, the steps occur like:

  *   The application queries node:8984 with compression enabled 
("Accept-Encoding: gzip") and wt=javabin
  *   Node:8984 can't perform the query and creates an HTTP request behind the 
scenes to node:8983
  *   Node:8983 returns a gzipped response with "Content-Encoding: gzip" and 
"Content-Type: application/octet-stream"
  *   Node:8984 adds the "Content-Encoding: gzip" value as the response's 
character encoding (it should be forwarded as a "Content-Encoding" header, not 
set as the character encoding)
  *   HttpSolrClient receives a "Content-Type: 
application/octet-stream;charset=gzip", causing an exception.
  *   HttpSolrClient tries to quietly close the connection, but since the 
stream is broken, the Utils.consumeFully fails to actually consume the entity 
(it throws another exception in GzipDecompressingEntity#getContent() with "not 
in GZIP format")

The exception thrown by HttpSolrClient is:
java.nio.charset.UnsupportedCharsetException: gzip
   at java.nio.charset.Charset.forName(Charset.java:531)
   at 
org.apache.http.entity.ContentType.create(ContentType.java:271)
   at 
org.apache.http.entity.ContentType.create(ContentType.java:261)
   at org.apache.http.entity.ContentType.parse(ContentType.java:319)
   at 
org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:591)
   at 
org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:255)
   at 
org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:244)
   at 
org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:194)
   at 
org.apache.solr.client.solrj.SolrClient.query(SolrClient.java:1015)
   at 
org.apache.solr.client.solrj.SolrClient.query(SolrClient.java:1031)
   at 
org.

SolrJ connection leak with SolrCloud and Jetty Gzip compression enabled

2020-04-13 Thread Samuel Garcia Martinez
Hi!

Today, I've seen a weird issue in production workloads when the gzip 
compression was enabled. After some minutes, the client app ran out of 
connections and stopped responding.

The cluster setup is pretty simple:
Solr version: 7.7.2
Solr cloud enabled
Cluster topology: 6 nodes, 1 single collection, 10 shards and 3 replicas. 1 
HTTP LB using Round Robin over all nodes
All cluster nodes have gzip enabled for all paths, all HTTP verbs and all MIME 
types.
Solr client: HttpSolrClient targeting the HTTP LB

Problem description: when the Solr node that receives the request has to 
forward the request to a Solr Node that actually can perform the query, the 
response headers are added incorrectly to the client response, causing the 
SolrJ client to fail and to never release the connection back to the pool.

To simplify the case, let's try to start from the following repro scenario:

  *   Start one node with cloud mode and port 8983
  *   Create one single collection (1 shard, 1 replica)
  *   Start another node with port 8984 and the previously started zk (-z 
localhost:9983)
  *   Start a java application and query the cluster using the node on port 
8984 (the one that doesn't host the collection)

So, the steps occur like:

  *   The application queries node:8984 with compression enabled 
("Accept-Encoding: gzip") and wt=javabin
  *   Node:8984 can't perform the query and creates an HTTP request behind the 
scenes to node:8983
  *   Node:8983 returns a gzipped response with "Content-Encoding: gzip" and 
"Content-Type: application/octet-stream"
  *   Node:8984 adds the "Content-Encoding: gzip" value as the response's 
character encoding (it should be forwarded as a "Content-Encoding" header, not 
set as the character encoding)
  *   HttpSolrClient receives a "Content-Type: 
application/octet-stream;charset=gzip", causing an exception.
  *   HttpSolrClient tries to quietly close the connection, but since the 
stream is broken, the Utils.consumeFully fails to actually consume the entity 
(it throws another exception in GzipDecompressingEntity#getContent() with "not 
in GZIP format")

The exception thrown by HttpSolrClient is:
java.nio.charset.UnsupportedCharsetException: gzip
   at java.nio.charset.Charset.forName(Charset.java:531)
   at 
org.apache.http.entity.ContentType.create(ContentType.java:271)
   at 
org.apache.http.entity.ContentType.create(ContentType.java:261)
   at org.apache.http.entity.ContentType.parse(ContentType.java:319)
   at 
org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:591)
   at 
org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:255)
   at 
org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:244)
   at 
org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:194)
   at 
org.apache.solr.client.solrj.SolrClient.query(SolrClient.java:1015)
   at 
org.apache.solr.client.solrj.SolrClient.query(SolrClient.java:1031)
   at 
org.apache.solr.client.solrj.SolrClient$$FastClassBySpringCGLIB$$7fcf36a0.invoke()
   at 
org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:218)

Here I can see three different problems:

  *   HttpSolrCall should not use HttpServletResponse#setCharacterEncoding to 
set the Content-Encoding header. This is obviously a mistake.
  *   HttpSolrClient, specifically the HttpClientUtil, should be modified so 
that if the Content-Encoding header lies about the actual content, an 
exception is still thrown but the connection is not leaked forever.
  *   HttpSolrClient should allow clients to customize HttpClient's 
connectionRequestTimeout, so that the application is not blocked from 
responding to other incoming requests because all threads are forever stuck 
waiting for a free connection that will never be free.

I think the first two points are bugs and the third one is a feature improvement. 
Unless I missed something, I'll file the two bugs and provide a patch for them. 
The same goes for the feature improvement.
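
For anyone reproducing this, a minimal client along these lines exercises the 
compressed, proxied path (the URL and collection name are illustrative):

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.response.QueryResponse;

public class GzipRepro {
    public static void main(String[] args) throws Exception {
        // Target the node that does NOT host the collection (port 8984
        // in the repro above), so it must proxy the query to node:8983.
        try (HttpSolrClient client =
                 new HttpSolrClient.Builder("http://localhost:8984/solr")
                     .allowCompression(true) // sends "Accept-Encoding: gzip"
                     .build()) {
            QueryResponse rsp = client.query("testcollection", new SolrQuery("*:*"));
            System.out.println(rsp.getResults().getNumFound());
        }
    }
}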







Queries on adding headers to solrj Request

2020-04-13 Thread dinesh naik
Hi all,
We are planning to add security to Solr using . For this we are adding some
information in the headers of each SolrJ request. These requests will be
intercepted by some application (proxy) on the Solr VM and then routed to
Solr (considering the Solr port to be 8983).
Could you please answer below queries:
 1. Are there any APIs (paths) that a Solr client cannot access and only Solr
uses for intra-node communication?
 2. As the SolrJ client will add headers, intra-node communication from Solr
also needs to add these headers (like a ping request from Solr node 1 to
Solr node 2). Could Solr add custom headers for intra-node communication?
 3. Apart from 8983, are there any other ports Solr is using for intra-node
communication?
 4. How to add headers to CloudSolrClient?

-- 
Best Regards,
Dinesh Naik
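
Regarding question 4, one possible approach, sketched under the assumption of 
SolrJ 7.x/8.x and Apache HttpClient 4.x (the header name, token, and ZooKeeper 
address are placeholders): build the underlying HttpClient with default headers 
and hand it to the builder. Note this only affects client-to-Solr requests; 
Solr's own intra-node requests (question 2) are not influenced by client settings.

import java.util.Collections;
import java.util.Optional;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;
import org.apache.http.message.BasicHeader;
import org.apache.solr.client.solrj.impl.CloudSolrClient;

public class HeaderedCloudClient {
    public static CloudSolrClient build() {
        // Every request from this client carries the extra header.
        CloseableHttpClient httpClient = HttpClients.custom()
            .setDefaultHeaders(Collections.singletonList(
                new BasicHeader("X-Auth-Token", "my-token")))
            .build();
        return new CloudSolrClient.Builder(
                Collections.singletonList("zkhost:2181"), Optional.empty())
            .withHttpClient(httpClient)
            .build();
    }
}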


Re: how to add multiple value for a filter query in Solrj

2020-03-24 Thread Erick Erickson
Your original formation of the filter query has two problems:

1> you included a “+” in the value. My guess is that you misinterpreted the 
 URL you got back from the browser in the admin UI where a “+” is a 
 URL-encoded space. You’ll also see a bunch of %XX in the URL which are
 other encodings.

2> you included double quotes, which can change things to be phrase queries.

Looking at the debug version would have helped you pinpoint these.


> On Mar 24, 2020, at 5:34 AM, Szűcs Roland  wrote:
> 
> Hi All,
> 
> I use Solr 8.4.1 and the latest solrj client.
> There is a field, let's say, which can have 3 different values. If I use the
> admin UI, I write to the fq the following: filterName:"value1"
> filterName:"value2" and it is working as expected.
> If I use solrJ SolrQuery.addFilterQuery method and call it twice like:
> addFilterQuery(filterName+":\""+value1+"\"");
> addFilterQuery(filterName+":\""+value2+"\"");
> I did not get any document back.
> 
> Can somebody help me what syntax is appropriate with solrj to add filter
> queries one by one if there is one filter field but multiple values?
> 
> Thanks,
> 
> Roland



Re: how to add multiple value for a filter query in Solrj

2020-03-24 Thread Szűcs Roland
Thanks Avi, it worked.

Raboah, Avi  wrote (on 24 March 2020, Tuesday at
11:08):

> You can do something like that if we are talking on the same filter query
> name.
>
> addFilterQuery(String.format("%s:(%s %s)", filterName, value1, value2));
>
>
> -Original Message-
> From: Szűcs Roland 
> Sent: Tuesday, March 24, 2020 11:35 AM
> To: solr-user@lucene.apache.org
> Subject: how to add multiple value for a filter query in Solrj
>
> Hi All,
>
> I use Solr 8.4.1 and the latest solrj client.
> There is a field, let's say, which can have 3 different values. If I use the
> admin UI, I write to the fq the following: filterName:"value1"
> filterName:"value2" and it is working as expected.
> If I use solrJ SolrQuery.addFilterQuery method and call it twice like:
> addFilterQuery(filterName+":\""+value1+"\"");
> addFilterQuery(filterName+":\""+value2+"\"");
> I did not get any document back.
>
> Can somebody help me what syntax is appropriate with solrj to add filter
> queries one by one if there is one filter field but multiple values?
>
> Thanks,
>
> Roland
>
>
>


RE: how to add multiple value for a filter query in Solrj

2020-03-24 Thread Raboah, Avi
You can do something like that if we are talking on the same filter query name.

addFilterQuery(String.format("%s:(%s %s)", filterName, value1, value2));


-Original Message-
From: Szűcs Roland 
Sent: Tuesday, March 24, 2020 11:35 AM
To: solr-user@lucene.apache.org
Subject: how to add multiple value for a filter query in Solrj

Hi All,

I use Solr 8.4.1 and the latest solrj client.
There is a field, let's say, which can have 3 different values. If I use the admin 
UI, I write to the fq the following: filterName:"value1"
filterName:"value2" and it is working as expected.
If I use solrJ SolrQuery.addFilterQuery method and call it twice like:
addFilterQuery(filterName+":\""+value1+"\"");
addFilterQuery(filterName+":\""+value2+"\"");
I did not get any document back.

Can somebody help me what syntax is appropriate with solrj to add filter 
queries one by one if there is one filter field but multiple values?

Thanks,

Roland




how to add multiple value for a filter query in Solrj

2020-03-24 Thread Szűcs Roland
Hi All,

I use Solr 8.4.1 and the latest solrj client.
There is a field, let's say, which can have 3 different values. If I use the
admin UI, I write to the fq the following: filterName:"value1"
filterName:"value2" and it is working as expected.
If I use solrJ SolrQuery.addFilterQuery method and call it twice like:
addFilterQuery(filterName+":\""+value1+"\"");
addFilterQuery(filterName+":\""+value2+"\"");
I did not get any document back.

Can somebody help me what syntax is appropriate with solrj to add filter
queries one by one if there is one filter field but multiple values?

Thanks,

Roland


Antw: Re: SolrJ 8.2: Too many Connection evictor threads

2020-02-11 Thread Andreas Kahl
Erick, 


Thanks, that's why we want to upgrade our clients to the same Solr(J) version 
as the server has. But I am still fighting the uncontrolled creation of those 
Connection evictor threads in my tomcat. 


Best Regards

Andreas


>>> Erick Erickson  11.02.20 15.06 Uhr >>>
Are you running a 5x SolrJ client against an 8x server? There’s no
guarantee at all that that would work (or vice-versa for that matter).

Most generally, SolrJ clients should be able to work with version X-1, but X-3
is unsupported.

Best,
Erick

> On Feb 11, 2020, at 6:36 AM, Andreas Kahl  wrote:
> 
> Hello everyone, 
> 
> we just updated our Solr from former 5.4 to 8.2. The server runs fine,
> but in our client applications we are seeing issues with thousands of
> threads created with the name "Connection evictor". 
> Can you give a hint how to limit those threads? 
> Should we better use HttpSolrClient or Http2SolrClient?
> Is another version of SolrJ advisable?
> 
> Thanks & Best Regards
> Andreas
> 




Re: SolrJ 8.2: Too many Connection evictor threads

2020-02-11 Thread Erick Erickson
Are you running a 5x SolrJ client against an 8x server? There’s no
guarantee at all that that would work (or vice-versa for that matter).

Most generally, SolrJ clients should be able to work with version X-1, but X-3
is unsupported.

Best,
Erick

> On Feb 11, 2020, at 6:36 AM, Andreas Kahl  wrote:
> 
> Hello everyone, 
> 
> we just updated our Solr from former 5.4 to 8.2. The server runs fine,
> but in our client applications we are seeing issues with thousands of
> threads created with the name "Connection evictor". 
> Can you give a hint how to limit those threads? 
> Should we better use HttpSolrClient or Http2SolrClient?
> Is another version of SolrJ advisable?
> 
> Thanks & Best Regards
> Andreas
> 



SolrJ 8.2: Too many Connection evictor threads

2020-02-11 Thread Andreas Kahl
Hello everyone, 

we just updated our Solr from the former 5.4 to 8.2. The server runs fine,
but in our client applications we are seeing issues with thousands of
threads created with the name "Connection evictor". 
Can you give a hint how to limit those threads? 
Should we better use HttpSolrClient or Http2SolrClient?
Is another version of SolrJ advisable?

Thanks & Best Regards
Andreas
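
Each Apache HttpClient instance typically owns one "Connection evictor" thread, 
so thousands of them usually mean a new client is being created per request 
instead of being shared. A minimal sketch of the usual remedy (the URL is 
illustrative):

import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.impl.HttpSolrClient;

public final class SolrClientHolder {
    // One shared, thread-safe client for the whole application; the
    // underlying HttpClient (and its evictor thread) lives until
    // close() is called.
    public static final SolrClient CLIENT =
        new HttpSolrClient.Builder("http://localhost:8983/solr").build();

    private SolrClientHolder() {}
}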



Re: Atomic solrj update

2019-12-15 Thread Paras Lehana
Hi Prem,

Using HttpClient to establish the connection and also I am *validating* whether
> the particular document is *available* in the collection or not, and after that
> updating the document.


Why do you need to validate the particular document before updating?
Atomic updates either update the document if it's already available or
create the document if it's not. I guess you don't want to create the
document if it doesn't exist, right?
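
For reference, a minimal sketch of an atomic update in SolrJ (the id, field, 
and collection names are illustrative; it assumes a client named "client" and 
an updateLog enabled in solrconfig.xml):

import java.util.Collections;
import org.apache.solr.common.SolrInputDocument;

// Only the listed field is changed; other stored fields are preserved.
SolrInputDocument doc = new SolrInputDocument();
doc.addField("id", "doc-123");
doc.addField("price_f", Collections.singletonMap("set", 9.99f));
client.add("collection1", doc);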



On Fri, 13 Dec 2019 at 11:42, Shawn Heisey  wrote:

> On 12/12/2019 10:00 PM, Prem wrote:
> > I am trying to partially update 50M documents in a collection from CSV
> > using an atomic update script (SolrJ). But it is taking 2 hrs for 1M
> > records. Is there any way I can speed up my update?
>
> How many documents are you sending in one request?
>
> > Using HTTPClient to establish connection and also i am validating whether
> > the particular document is available in collection or not and after that
> > updating the document.
>
> I thought you were using SolrJ ... but here you say you're using
> HTTPClient.
>
> Can you share your code?  What Solr server version? If you're using
> SolrJ, what version of that?
>
> If your program checks whether every single document already exists
> before sending an update, that is going to be quite slow.
>
> Thanks,
> Shawn
>


-- 
-- 
Regards,

*Paras Lehana* [65871]
Development Engineer, Auto-Suggest,
IndiaMART Intermesh Ltd.

8th Floor, Tower A, Advant-Navis Business Park, Sector 142,
Noida, UP, IN - 201303

Mob.: +91-9560911996
Work: 01203916600 | Extn:  *8173*



Re: Atomic solrj update

2019-12-12 Thread Shawn Heisey

On 12/12/2019 10:00 PM, Prem wrote:

I am trying to partially update 50M documents in a collection from CSV using
an atomic update script (SolrJ). But it is taking 2 hrs for 1M records. Is there
any way I can speed up my update?


How many documents are you sending in one request?


Using HttpClient to establish the connection, and also I am validating whether
the particular document is available in the collection or not, and after that
updating the document.


I thought you were using SolrJ ... but here you say you're using HTTPClient.

Can you share your code?  What Solr server version? If you're using 
SolrJ, what version of that?


If your program checks whether every single document already exists 
before sending an update, that is going to be quite slow.


Thanks,
Shawn
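
One common speed-up on the SolrJ side: send the documents in large batches and 
commit once at the end rather than per document. A sketch (the collection name 
and batch size are illustrative):

import java.util.ArrayList;
import java.util.List;
import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.common.SolrInputDocument;

public class BatchedUpdater {
    public static void index(SolrClient client, Iterable<SolrInputDocument> docs)
            throws Exception {
        List<SolrInputDocument> batch = new ArrayList<>();
        for (SolrInputDocument doc : docs) {
            batch.add(doc);
            if (batch.size() == 1000) { // one request per 1000 docs
                client.add("collection1", batch);
                batch.clear();
            }
        }
        if (!batch.isEmpty()) {
            client.add("collection1", batch);
        }
        client.commit("collection1"); // commit once, not per batch
    }
}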


Re: Atomic solrj update

2019-12-12 Thread Jörn Franke
One needs to see the code or get more insight into your design. Do you reuse the 
HttpClient or do you create a new one for every request?
How often do you commit?
Do you do parallel updates from the client (multiple threads)?
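
If parallel, batched updates turn out to be the answer, one option worth 
considering is SolrJ's ConcurrentUpdateSolrClient, which buffers documents 
client-side and sends them on several threads; a sketch with illustrative 
settings:

import org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient;

ConcurrentUpdateSolrClient client =
    new ConcurrentUpdateSolrClient.Builder("http://localhost:8983/solr/collection1")
        .withQueueSize(10000)  // buffered documents, illustrative
        .withThreadCount(4)    // parallel senders, illustrative
        .build();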

> On 13.12.2019 at 06:56, Prem  wrote:
> 
> I am trying to partially update 50M documents in a collection from CSV using
> an atomic update script (SolrJ). But it is taking 2 hrs for 1M records. Is
> there any way I can speed up my update?
> Using HttpClient to establish the connection, and also I am validating whether
> the particular document is available in the collection or not, and after that
> updating the document.
> 
> 
> 
> 
> --
> Sent from: https://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Atomic solrj update

2019-12-12 Thread Prem
I am trying to partially update 50M documents in a collection from CSV using
an atomic update script (SolrJ). But it is taking 2 hrs for 1M records. Is there
any way I can speed up my update?
I am using HttpClient to establish the connection, and also I am validating
whether the particular document is available in the collection or not, and
after that updating the document.




--
Sent from: https://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Re: Solr 8.3 Solrj streaming expressions do not return all field values

2019-11-06 Thread Jörn Franke
I created a JIRA for this:
https://issues.apache.org/jira/browse/SOLR-13894

On Wed, Nov 6, 2019 at 10:45 AM Jörn Franke  wrote:

> I have checked now Solr 8.3 server in admin UI. Same issue.
>
> Reproduction:
> select(search(testcollection,q=“test”,df=“Default”,defType=“edismax”,fl=“id”,
> qt=“/export”, sort=“id asc”),id,if(eq(1,1),Y,N) as found)
>
> In 8.3 it returns only the id field.
> In 8.2 it returns id,found field.
>
> Since found is generated by select (and not coming from the collection)
> there must be an issue with select.
>
> Any idea why this is happening?
>
> Debug logs do not show any error and the expression is correctly received
> by Solr.
>
> Thank you.
>
> Best regards
>
> > On 05.11.2019 at 14:59, Jörn Franke  wrote:
> >
> > Thanks I will check and come back to you. As far as I remember (but
> have to check) the queries generated by Solr were correct
> >
> > Just to be clear the same thing works with Solr 8.2 server and Solr 8.2
> client.
> >
> > It shows the odd behaviour with Solr 8.2 server and Solr 8.3 client.
> >
> >> On 05.11.2019 at 14:49, Joel Bernstein  wrote:
> >>
> >> I'll probably need some more details. One thing that's useful is to
> look at
> >> the logs and see the underlying Solr queries that are generated. Then
> try
> >> those underlying queries against the Solr index and see what comes
> back. If
> >> you're not seeing the fields with the plain Solr queries then we know
> it's
> >> something going on below streaming expressions. If you are seeing the
> >> fields then it's the expressions themselves that are not handling the
> data
> >> as expected.
> >>
> >>
> >> Joel Bernstein
> >> http://joelsolr.blogspot.com/
> >>
> >>
>  On Mon, Nov 4, 2019 at 9:09 AM Jörn Franke 
> wrote:
> >>>
> >>> Most likely this issue can also be reproduced in the admin UI for the
> >>> streaming handler of a collection.
> >>>
> > On 04.11.2019 at 13:32, Jörn Franke  wrote:
> 
>  Hi,
> 
>  I use streaming expressions, e.g.
>  Sort(Select(search(...),id,if(eq(1,1),Y,N) as found), by=“field A
> asc”)
>  (Using export handler, sort is not really mandatory , I will remove it
> >>> later anyway)
> 
>  This works perfectly fine if I use Solr 8.2.0 (server + client). It
> >>> returns Tuples in the form { “id”,”12345”, “found”:”Y”}
> 
>  However, if I use Solr 8.2.0 as server and Solr 8.3.0 as client then
> the
> >>> above statement only returns the id field, but not the found field.
> 
>  Questions:
>  1) is this expected behavior, ie Solr client 8.3.0 is in this case not
> >>> compatible with Solr 8.2.0 and server upgrade to Solr 8.3.0 will fix
> this?
>  2) has the syntax for the above expression changed? If so how?
>  3) is this not expected behavior and I should create a Jira for it?
> 
>  Thank you.
>  Best regards
> >>>
>


Re: Solr 8.3 Solrj streaming expressions do not return all field values

2019-11-06 Thread Jörn Franke
I have now checked the Solr 8.3 server in the admin UI. Same issue.

Reproduction:
select(search(testcollection,q=“test”,df=“Default”,defType=“edismax”,fl=“id”, 
qt=“/export”, sort=“id asc”),id,if(eq(1,1),Y,N) as found)

In 8.3 it returns only the id field.
In 8.2 it returns id,found field.

Since found is generated by select (and not coming from the collection) there 
must be an issue with select. 

Any idea why this is happening?

Debug logs do not show any error and the expression is correctly received by 
Solr.

Thank you.

Best regards

> On 05.11.2019 at 14:59, Jörn Franke  wrote:
> 
> Thanks I will check and come back to you. As far as I remember (but have to 
> check) the queries generated by Solr were correct
> 
> Just to be clear the same thing works with Solr 8.2 server and Solr 8.2 
> client.
> 
> It shows the odd behaviour with Solr 8.2 server and Solr 8.3 client.
> 
>> On 05.11.2019 at 14:49, Joel Bernstein  wrote:
>> 
>> I'll probably need some more details. One thing that's useful is to look at
>> the logs and see the underlying Solr queries that are generated. Then try
>> those underlying queries against the Solr index and see what comes back. If
>> you're not seeing the fields with the plain Solr queries then we know it's
>> something going on below streaming expressions. If you are seeing the
>> fields then it's the expressions themselves that are not handling the data
>> as expected.
>> 
>> 
>> Joel Bernstein
>> http://joelsolr.blogspot.com/
>> 
>> 
 On Mon, Nov 4, 2019 at 9:09 AM Jörn Franke  wrote:
>>> 
>>> Most likely this issue can also be reproduced in the admin UI for the
>>> streaming handler of a collection.
>>> 
> On 04.11.2019 at 13:32, Jörn Franke  wrote:
 
 Hi,
 
 I use streaming expressions, e.g.
 Sort(Select(search(...),id,if(eq(1,1),Y,N) as found), by=“field A asc”)
 (Using export handler, sort is not really mandatory , I will remove it
>>> later anyway)
 
 This works perfectly fine if I use Solr 8.2.0 (server + client). It
>>> returns Tuples in the form { “id”,”12345”, “found”:”Y”}
 
 However, if I use Solr 8.2.0 as server and Solr 8.3.0 as client then the
>>> above statement only returns the id field, but not the found field.
 
 Questions:
 1) is this expected behavior, ie Solr client 8.3.0 is in this case not
>>> compatible with Solr 8.2.0 and server upgrade to Solr 8.3.0 will fix this?
 2) has the syntax for the above expression changed? If so how?
 3) is this not expected behavior and I should create a Jira for it?
 
 Thank you.
 Best regards
>>> 


Re: Solr 8.3 Solrj streaming expressions do not return all field values

2019-11-05 Thread Jörn Franke
Thanks, I will check and come back to you. As far as I remember (but have to 
check), the queries generated by Solr were correct.

Just to be clear the same thing works with Solr 8.2 server and Solr 8.2 client.

It shows the odd behaviour with Solr 8.2 server and Solr 8.3 client.

> On 05.11.2019 at 14:49, Joel Bernstein  wrote:
> 
> I'll probably need some more details. One thing that's useful is to look at
> the logs and see the underlying Solr queries that are generated. Then try
> those underlying queries against the Solr index and see what comes back. If
> you're not seeing the fields with the plain Solr queries then we know it's
> something going on below streaming expressions. If you are seeing the
> fields then it's the expressions themselves that are not handling the data
> as expected.
> 
> 
> Joel Bernstein
> http://joelsolr.blogspot.com/
> 
> 
>> On Mon, Nov 4, 2019 at 9:09 AM Jörn Franke  wrote:
>> 
>> Most likely this issue can also be reproduced in the admin UI for the
>> streaming handler of a collection.
>> 
 On 04.11.2019 at 13:32, Jörn Franke  wrote:
>>> 
>>> Hi,
>>> 
>>> I use streaming expressions, e.g.
>>> Sort(Select(search(...),id,if(eq(1,1),Y,N) as found), by=“field A asc”)
>>> (Using export handler, sort is not really mandatory , I will remove it
>> later anyway)
>>> 
>>> This works perfectly fine if I use Solr 8.2.0 (server + client). It
>> returns Tuples in the form { “id”,”12345”, “found”:”Y”}
>>> 
>>> However, if I use Solr 8.2.0 as server and Solr 8.3.0 as client then the
>> above statement only returns the id field, but not the found field.
>>> 
>>> Questions:
>>> 1) is this expected behavior, ie Solr client 8.3.0 is in this case not
>> compatible with Solr 8.2.0 and server upgrade to Solr 8.3.0 will fix this?
>>> 2) has the syntax for the above expression changed? If so how?
>>> 3) is this not expected behavior and I should create a Jira for it?
>>> 
>>> Thank you.
>>> Best regards
>> 


Re: Solr 8.3 Solrj streaming expressions do not return all field values

2019-11-05 Thread Joel Bernstein
I'll probably need some more details. One thing that's useful is to look at
the logs and see the underlying Solr queries that are generated. Then try
those underlying queries against the Solr index and see what comes back. If
you're not seeing the fields with the plain Solr queries then we know it's
something going on below streaming expressions. If you are seeing the
fields then it's the expressions themselves that are not handling the data
as expected.


Joel Bernstein
http://joelsolr.blogspot.com/


On Mon, Nov 4, 2019 at 9:09 AM Jörn Franke  wrote:

> Most likely this issue can also be reproduced in the admin UI for the
> streaming handler of a collection.
>
> > On 04.11.2019 at 13:32, Jörn Franke  wrote:
> >
> > Hi,
> >
> > I use streaming expressions, e.g.
> > Sort(Select(search(...),id,if(eq(1,1),Y,N) as found), by=“field A asc”)
> > (Using export handler, sort is not really mandatory , I will remove it
> later anyway)
> >
> > This works perfectly fine if I use Solr 8.2.0 (server + client). It
> returns Tuples in the form { “id”,”12345”, “found”:”Y”}
> >
> > However, if I use Solr 8.2.0 as server and Solr 8.3.0 as client then the
> above statement only returns the id field, but not the found field.
> >
> > Questions:
> > 1) is this expected behavior, ie Solr client 8.3.0 is in this case not
> compatible with Solr 8.2.0 and server upgrade to Solr 8.3.0 will fix this?
> > 2) has the syntax for the above expression changed? If so how?
> > 3) is this not expected behavior and I should create a Jira for it?
> >
> > Thank you.
> > Best regards
>


Re: Delete documents from the Solr index using SolrJ

2019-11-05 Thread Erick Erickson
OK, you have two options:

1.1> do NOT construct IDs with the version. Have two separate fields: id (which 
is the <uniqueKey> in your schema) and a _separate_ field called tracking (note, 
there’s already by default an _version_ field, with underscores, for optimistic 
locking; do not use that).

1.2> Index the new version of the doc with the exact same ID and a new version 
and a new “tracking” value

Solr will replace the old version with the new version based on the ID.

Second:
Before you re-add the doc, issue a delete-by-query that identifies the 
document, something like q=id:123*

_How_ you determine that there is a new version of the doc you need to index is 
outside of Solr, you have to do that yourself.

Best,
Erick
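
A minimal SolrJ sketch of both options (the collection, ids, and the tracking 
field name are illustrative, and "client" is an existing SolrClient):

import org.apache.solr.common.SolrInputDocument;

// Option 1 (preferred): keep the uniqueKey stable; re-adding a document
// with the same id simply overwrites the previous version.
SolrInputDocument doc = new SolrInputDocument();
doc.addField("id", "20");           // stable uniqueKey
doc.addField("tracking_s", "20.1"); // separate, illustrative version field
client.add("collection1", doc);

// Option 2: if ids encode versions (20, 20.1, ...), delete the old
// versions by query before adding the document under its new id.
client.deleteByQuery("collection1", "id:20*");
client.commit("collection1");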

> On Nov 5, 2019, at 3:56 AM, Khare, Kushal (MIND) 
>  wrote:
> 
> Well, I still cannot completely relate to the solutions you suggested; I am 
> looking into how I could achieve that with my application. Thanks!
> One thing that I want to know is how to avoid full re-indexing. That is, I 
> don’t want Solr to index all the data every time some docs are added; instead 
> I want it to update the index, that is, index only the newly added docs. I 
> hope this is possible, but how?
> Because currently I am using SolrJ and it re-indexes the complete data each time.
> 
> -Original Message-
> From: Peter Lancaster [mailto:peter.lancas...@findmypast.com]
> Sent: 04 November 2019 21:35
> To: solr-user@lucene.apache.org
> Subject: RE: Delete documents from the Solr index using SolrJ
> 
> You can delete documents in SolrJ by using deleteByQuery. Using this you can 
> delete any number of documents from your index or all your documents 
> depending on the query you specify as the parameter. How you use it is down 
> to your application.
> 
> You haven't said if your application performs a full re-index, but if so you 
> might find it useful to index a version number for your data which you 
> increment each time you perform the full indexing. Then you can increment 
> version, re-index data, delete data for old version number.
> 
> 
> -Original Message-
> From: Khare, Kushal (MIND) [mailto:kushal.kh...@mind-infotech.com]
> Sent: 04 November 2019 15:03
> To: solr-user@lucene.apache.org
> Subject: [EXTERNAL] RE: Delete documents from the Solr index using SolrJ
> 
> Thanks!
> Actually I am working on a Java web application using SolrJ for Solr search.
> The users would be uploading/editing/deleting the docs. What I have 
> done is define a location/directory where the docs would be stored and 
> pass that location for indexing.
> So, I am quite confused about how to carry on with the solution that you proposed. 
> Please guide!
> 
> -Original Message-
> From: David Hastings [mailto:hastings.recurs...@gmail.com]
> Sent: 04 November 2019 20:10
> To: solr-user@lucene.apache.org
> Subject: Re: Delete documents from the Solr index using SolrJ
> 
> Delete them by query would do the trick unless I'm missing something 
> significant in what you're trying to do here. You can just pass in an XML
> command:
> '<delete><query>".$kill_query."</query></delete>'
> 
> On Mon, Nov 4, 2019 at 9:37 AM Khare, Kushal (MIND) < 
> kushal.kh...@mind-infotech.com> wrote:
> 
>> In my case, the id won't be the same.
>> Suppose I have a doc with id: 20
>> Now, its newer version would be either 20.1 or 22. What in this case?
>> -Original Message-
>> From: David Hastings [mailto:hastings.recurs...@gmail.com]
>> Sent: 04 November 2019 20:04
>> To: solr-user@lucene.apache.org
>> Subject: Re: Delete documents from the Solr index using SolrJ
>> 
>> when you add a new document using the same "id" value as another, it
>> just overwrites it
>> 
>> On Mon, Nov 4, 2019 at 9:30 AM Khare, Kushal (MIND) <
>> kushal.kh...@mind-infotech.com> wrote:
>> 
>>> Could you please let me know how to achieve that ?
>>> 
>>> 
>>> -Original Message-
>>> From: Jörn Franke [mailto:jornfra...@gmail.com]
>>> Sent: 04 November 2019 19:59
>>> To: solr-user@lucene.apache.org
>>> Subject: Re: Delete documents from the Solr index using SolrJ
>>> 
>>> I don’t understand why it is not possible.
>>> 
>>> However why don’t you simply overwrite the existing document instead
>>> of
>>> add+delete
>>> 
> > >> On 04.11.2019 at 15:12, Khare, Kushal (MIND) <
> > kushal.kh...@mind-infotech.com> wrote:
> > >>
> > >> Hello mates!
> > >> I want to know how we can delete documents from the Solr index.
> > Suppose for my system, I have a document that has been indexed, and now

RE: Delete documents from the Solr index using SolrJ

2019-11-05 Thread Khare, Kushal (MIND)
Well, I still cannot completely relate to the solutions you suggested; I am looking 
into how I could achieve that with my application. Thanks!
One thing that I want to know is how to avoid full re-indexing. That is, I don’t 
want Solr to index all the data every time some docs are added; instead I want it 
to update the index, that is, index only the newly added docs. I hope this is 
possible, but how?
Because currently I am using SolrJ and it re-indexes the complete data each time.

-Original Message-
From: Peter Lancaster [mailto:peter.lancas...@findmypast.com]
Sent: 04 November 2019 21:35
To: solr-user@lucene.apache.org
Subject: RE: Delete documents from the Solr index using SolrJ

You can delete documents in SolrJ by using deleteByQuery. Using this you can 
delete any number of documents from your index or all your documents depending 
on the query you specify as the parameter. How you use it is down to your 
application.

You haven't said if your application performs a full re-index, but if so you 
might find it useful to index a version number for your data which you 
increment each time you perform the full indexing. Then you can increment 
version, re-index data, delete data for old version number.


-Original Message-
From: Khare, Kushal (MIND) [mailto:kushal.kh...@mind-infotech.com]
Sent: 04 November 2019 15:03
To: solr-user@lucene.apache.org
Subject: [EXTERNAL] RE: Delete documents from the Solr index using SolrJ

Thanks!
Actually I am working on a Java web application using SolrJ for Solr search.
The users would be uploading/editing/deleting the docs. What I have done 
is define a location/directory where the docs would be stored and pass that 
location for indexing.
So, I am quite confused about how to carry on with the solution that you proposed. 
Please guide!

-Original Message-
From: David Hastings [mailto:hastings.recurs...@gmail.com]
Sent: 04 November 2019 20:10
To: solr-user@lucene.apache.org
Subject: Re: Delete documents from the Solr index using SolrJ

Delete them by query would do the trick unless I'm missing something significant 
in what you're trying to do here. You can just pass in an XML
command:
'<delete><query>".$kill_query."</query></delete>'

On Mon, Nov 4, 2019 at 9:37 AM Khare, Kushal (MIND) < 
kushal.kh...@mind-infotech.com> wrote:

> In my case, the id won't be the same.
> Suppose I have a doc with id: 20
> Now, its newer version would be either 20.1 or 22. What in this case?
> -Original Message-
> From: David Hastings [mailto:hastings.recurs...@gmail.com]
> Sent: 04 November 2019 20:04
> To: solr-user@lucene.apache.org
> Subject: Re: Delete documents from the Solr index using SolrJ
>
> when you add a new document using the same "id" value as another, it
> just overwrites it
>
> On Mon, Nov 4, 2019 at 9:30 AM Khare, Kushal (MIND) <
> kushal.kh...@mind-infotech.com> wrote:
>
> > Could you please let me know how to achieve that ?
> >
> >
> > -Original Message-
> > From: Jörn Franke [mailto:jornfra...@gmail.com]
> > Sent: 04 November 2019 19:59
> > To: solr-user@lucene.apache.org
> > Subject: Re: Delete documents from the Solr index using SolrJ
> >
> > I don’t understand why it is not possible.
> >
> > However why don’t you simply overwrite the existing document instead
> > of
> > add+delete
> >
> > > On 04.11.2019 at 15:12, Khare, Kushal (MIND) <
> > kushal.kh...@mind-infotech.com> wrote:
> > >
> > > Hello mates!
> > > I want to know how we can delete documents from the Solr index.
> > Suppose for my system, I have a document that has been indexed, and now
> > its newer version is in use, so I want to use the latest one; for
> > that I want the previous one to be deleted from the index.
> > > Kindly help me a way out!
> > > I went through many articles and blogs and got the methods for
> > deleting, but not actually how to do it, because it's not possible
> > to delete every time by passing ids in a system of around 50,000 docs.
> > > Please suggest!
> > >

Re: Delete documents from the Solr index using SolrJ

2019-11-04 Thread Erick Erickson
What Walter said. If you require displaying the version number in the UI, put 
that in a separate field.

BTW, Delete-by-query can be expensive for various arcane reasons if you’re 
using SolrCloud.

> On Nov 4, 2019, at 11:08 AM, Walter Underwood  wrote:
> 
> If it is the same document, why are you changing the ID? Use the same ID and 
> you are done. You won’t need to delete previous versions.
> 
> wunder
> Walter Underwood
> wun...@wunderwood.org
> http://observer.wunderwood.org/  (my blog)
> 
>> On Nov 4, 2019, at 8:37 AM, Khare, Kushal (MIND) 
>>  wrote:
>> 
>> In my case, the id won't be the same.
>> Suppose I have a doc with id: 20
>> Now, its newer version would be either 20.1 or 22.
>> What in this case?
>> -Original Message-
>> From: David Hastings [mailto:hastings.recurs...@gmail.com]
>> Sent: 04 November 2019 20:04
>> To: solr-user@lucene.apache.org
>> Subject: Re: Delete documents from the Solr index using SolrJ
>> 
>> when you add a new document using the same "id" value as another, it just
>> overwrites it
>> 
>> On Mon, Nov 4, 2019 at 9:30 AM Khare, Kushal (MIND) < 
>> kushal.kh...@mind-infotech.com> wrote:
>> 
>>> Could you please let me know how to achieve that ?
>>> 
>>> 
>>> -Original Message-
>>> From: Jörn Franke [mailto:jornfra...@gmail.com]
>>> Sent: 04 November 2019 19:59
>>> To: solr-user@lucene.apache.org
>>> Subject: Re: Delete documents from the Solr index using SolrJ
>>> 
>>> I don’t understand why it is not possible.
>>> 
>>> However why don’t you simply overwrite the existing document instead
>>> of
>>> add+delete
>>> 
>>>> On 04.11.2019 at 15:12, Khare, Kushal (MIND) <
>>> kushal.kh...@mind-infotech.com> wrote:
>>>> 
>>>> Hello mates!
>>>> I want to know how we can delete documents from the Solr index.
>>> Suppose for my system, I have a document that has been indexed, and now
>>> its newer version is in use, so I want to use the latest one; for
>>> that I want the previous one to be deleted from the index.
>>>> Kindly help me a way out!
>>>> I went through many articles and blogs and got the methods for
>>> deleting, but not actually how to do it, because it's not possible
>>> to delete every time by passing ids in a system of around 50,000 docs.
>>>> Please suggest!
>>>> 
> 



Re: Delete documents from the Solr index using SolrJ

2019-11-04 Thread Walter Underwood
If it is the same document, why are you changing the ID? Use the same ID and 
you are done. You won’t need to delete previous versions.

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)

> On Nov 4, 2019, at 8:37 AM, Khare, Kushal (MIND) 
>  wrote:
> 
> In my case, the id won't be the same.
> Suppose I have a doc with id: 20
> Now, its newer version would be either 20.1 or 22.
> What in this case?
> -Original Message-
> From: David Hastings [mailto:hastings.recurs...@gmail.com]
> Sent: 04 November 2019 20:04
> To: solr-user@lucene.apache.org
> Subject: Re: Delete documents from the Solr index using SolrJ
> 
> when you add a new document using the same "id" value as another, it just
> overwrites it
> 
> On Mon, Nov 4, 2019 at 9:30 AM Khare, Kushal (MIND) < 
> kushal.kh...@mind-infotech.com> wrote:
> 
>> Could you please let me know how to achieve that ?
>> 
>> 
>> -Original Message-
>> From: Jörn Franke [mailto:jornfra...@gmail.com]
>> Sent: 04 November 2019 19:59
>> To: solr-user@lucene.apache.org
>> Subject: Re: Delete documents from the Solr index using SolrJ
>> 
>> I don’t understand why it is not possible.
>> 
>> However why don’t you simply overwrite the existing document instead
>> of
>> add+delete
>> 
>>> On 04.11.2019 at 15:12, Khare, Kushal (MIND) <
>> kushal.kh...@mind-infotech.com> wrote:
>>> 
>>> Hello mates!
>>> I want to know how we can delete documents from the Solr index.
>> Suppose for my system, I have a document that has been indexed, and now
>> its newer version is in use, so I want to use the latest one; for
>> that I want the previous one to be deleted from the index.
>>> Kindly help me a way out!
>>> I went through many articles and blogs and got the methods for
>> deleting, but not actually how to do it, because it's not possible
>> to delete every time by passing ids in a system of around 50,000 docs.
>>> Please suggest!
>>> 



RE: Delete documents from the Solr index using SolrJ

2019-11-04 Thread Peter Lancaster
You can delete documents in SolrJ by using deleteByQuery. Using this you can 
delete any number of documents from your index or all your documents depending 
on the query you specify as the parameter. How you use it is down to your 
application.

You haven't said if your application performs a full re-index, but if so you 
might find it useful to index a version number for your data which you 
increment each time you perform the full indexing. Then you can increment 
version, re-index data, delete data for old version number.
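
A sketch of that version-number cleanup in SolrJ (the field and collection 
names are illustrative, and "client" is an existing SolrClient):

// After a full re-index tagged with generation N, drop everything older
// in a single delete-by-query.
long generation = 42; // illustrative current version number
client.deleteByQuery("collection1",
    "generation_l:[* TO " + (generation - 1) + "]");
client.commit("collection1");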


-Original Message-
From: Khare, Kushal (MIND) [mailto:kushal.kh...@mind-infotech.com]
Sent: 04 November 2019 15:03
To: solr-user@lucene.apache.org
Subject: [EXTERNAL] RE: Delete documents from the Solr index using SolrJ

Thanks!
Actually I am working on a Java web application using SolrJ for Solr search.
The users would be uploading/editing/deleting the docs. What I have done 
is define a location/directory where the docs would be stored and pass that 
location for indexing.
So, I am quite confused about how to carry on with the solution that you proposed. 
Please guide!

-Original Message-
From: David Hastings [mailto:hastings.recurs...@gmail.com]
Sent: 04 November 2019 20:10
To: solr-user@lucene.apache.org
Subject: Re: Delete documents from the Solr index using SolrJ

Delete them by query would do the trick unless I'm missing something significant 
in what you're trying to do here. You can just pass in an XML
command:
'<delete><query>".$kill_query."</query></delete>'

On Mon, Nov 4, 2019 at 9:37 AM Khare, Kushal (MIND) < 
kushal.kh...@mind-infotech.com> wrote:

> In my case, the id won't be the same.
> Suppose I have a doc with id: 20
> Now, its newer version would be either 20.1 or 22. What in this case?
> -Original Message-
> From: David Hastings [mailto:hastings.recurs...@gmail.com]
> Sent: 04 November 2019 20:04
> To: solr-user@lucene.apache.org
> Subject: Re: Delete documents from the Solr index using SolrJ
>
> when you add a new document using the same "id" value as another, it
> just overwrites it
>
> On Mon, Nov 4, 2019 at 9:30 AM Khare, Kushal (MIND) <
> kushal.kh...@mind-infotech.com> wrote:
>
> > Could you please let me know how to achieve that ?
> >
> >
> > -Original Message-
> > From: Jörn Franke [mailto:jornfra...@gmail.com]
> > Sent: 04 November 2019 19:59
> > To: solr-user@lucene.apache.org
> > Subject: Re: Delete documents from the Solr index using SolrJ
> >
> > I don’t understand why it is not possible.
> >
> > However why don’t you simply overwrite the existing document instead
> > of
> > add+delete
> >
> > > On 04.11.2019 at 15:12, Khare, Kushal (MIND) <
> > kushal.kh...@mind-infotech.com> wrote:
> > >
> > > Hello mates!
> > > I want to know how we can delete documents from the Solr index.
> > Suppose for my system, I have a document that has been indexed, and now
> > its newer version is in use, so I want to use the latest one; for
> > that I want the previous one to be deleted from the index.
> > > Kindly help me a way out!
> > > I went through many articles and blogs and got the methods for
> > deleting, but not actually how to do it, because it's not possible
> > to delete every time by passing ids in a system of around 50,000 docs.
> > > Please suggest!
> > >
RE: Delete documents from the Solr index using SolrJ

2019-11-04 Thread Khare, Kushal (MIND)
Thanks!
Actually I am working on a Java web application using SolrJ for Solr search.
The users would be uploading/editing/deleting the docs. What I have done 
is define a location/directory where the docs would be stored and pass that 
location for indexing.
So, I am quite confused about how to carry on with the solution that you proposed. 
Please guide!

-Original Message-
From: David Hastings [mailto:hastings.recurs...@gmail.com]
Sent: 04 November 2019 20:10
To: solr-user@lucene.apache.org
Subject: Re: Delete documents from the Solr index using SolrJ

Delete them by query would do the trick unless I'm missing something significant 
in what you're trying to do here. You can just pass in an XML
command:
'<delete><query>".$kill_query."</query></delete>'

On Mon, Nov 4, 2019 at 9:37 AM Khare, Kushal (MIND) < 
kushal.kh...@mind-infotech.com> wrote:

> In my case, the id won't be the same.
> Suppose I have a doc with id: 20
> Now, its newer version would be either 20.1 or 22. What in this case?
> -Original Message-
> From: David Hastings [mailto:hastings.recurs...@gmail.com]
> Sent: 04 November 2019 20:04
> To: solr-user@lucene.apache.org
> Subject: Re: Delete documents from the Solr index using SolrJ
>
> when you add a new document using the same "id" value as another, it
> just overwrites it
>
> On Mon, Nov 4, 2019 at 9:30 AM Khare, Kushal (MIND) <
> kushal.kh...@mind-infotech.com> wrote:
>
> > Could you please let me know how to achieve that ?
> >
> >
> > -Original Message-
> > From: Jörn Franke [mailto:jornfra...@gmail.com]
> > Sent: 04 November 2019 19:59
> > To: solr-user@lucene.apache.org
> > Subject: Re: Delete documents from the Solr index using SolrJ
> >
> > I don’t understand why it is not possible.
> >
> > However why don’t you simply overwrite the existing document instead
> > of
> > add+delete
> >
> > > On 04.11.2019 at 15:12, Khare, Kushal (MIND) <
> > kushal.kh...@mind-infotech.com> wrote:
> > >
> > > Hello mates!
> > > I want to know how we can delete documents from the Solr index.
> > Suppose for my system, I have a document that has been indexed, and now
> > its newer version is in use, so I want to use the latest one; for
> > that I want the previous one to be deleted from the index.
> > > Kindly help me a way out!
> > > I went through many articles and blogs and got the methods for
> > deleting, but not actually how to do it, because it's not possible
> > to delete every time by passing ids in a system of around 50,000 docs.
> > > Please suggest!
> > >

Re: Delete documents from the Solr index using SolrJ

2019-11-04 Thread David Hastings
Delete them by query would do the trick, unless I'm missing something
significant in what you're trying to do here. You can just pass in an XML
command:
'<delete><query>".$kill_query."</query></delete>'
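
(A sketch of that same XML command sent through SolrJ — DirectXmlRequest posts
a raw update message to the /update handler; the killQuery value is
illustrative:)

-
import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.request.DirectXmlRequest;

class XmlDelete {
    // Sends the raw XML delete command through SolrJ; killQuery is illustrative.
    static void deleteByXml(SolrClient client, String killQuery) throws Exception {
        String xml = "<delete><query>" + killQuery + "</query></delete>";
        new DirectXmlRequest("/update", xml).process(client);
        client.commit();
    }
}
-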

On Mon, Nov 4, 2019 at 9:37 AM Khare, Kushal (MIND) <
kushal.kh...@mind-infotech.com> wrote:

> In my case, the id won't be the same.
> Suppose I have a doc with id: 20.
> Now, its newer version would be either 20.1 or 22.
> What happens in this case?
> -Original Message-
> From: David Hastings [mailto:hastings.recurs...@gmail.com]
> Sent: 04 November 2019 20:04
> To: solr-user@lucene.apache.org
> Subject: Re: Delete documents from the Solr index using SolrJ
>
> when you add a new document using the same "id" value as another, it just
> overwrites it
>
> On Mon, Nov 4, 2019 at 9:30 AM Khare, Kushal (MIND) <
> kushal.kh...@mind-infotech.com> wrote:
>
> > Could you please let me know how to achieve that?
> >
> >
> > -Original Message-
> > From: Jörn Franke [mailto:jornfra...@gmail.com]
> > Sent: 04 November 2019 19:59
> > To: solr-user@lucene.apache.org
> > Subject: Re: Delete documents from the Solr index using SolrJ
> >
> > I don’t understand why it is not possible.
> >
> > However, why don’t you simply overwrite the existing document instead
> > of doing add+delete?
> >
> > > Am 04.11.2019 um 15:12 schrieb Khare, Kushal (MIND) <
> > kushal.kh...@mind-infotech.com>:
> > >
> > > Hello mates!
> > > I want to know how we can delete documents from the Solr index.
> > > Suppose, in my system, I have a document that has been indexed, and now
> > > a newer version of it is in use; I want to use the latest one, so I want
> > > the previous one to be deleted from the index.
> > > Kindly help me find a way out!
> > > I went through many articles and blogs and found the methods for
> > > deleting, but not how to actually do it, because it's not feasible to
> > > delete by passing ids every time in a system of around 50,000 docs.
> > > Please suggest!
> > >

