RE: Faceting with EnumFieldType in 7.1

2018-09-14 Thread Peter Tyrrell
Yes.

Peter Tyrrell, MLIS
Lead Developer at Andornot
1-866-266-2525 x706 / ptyrr...@andornot.com

-Original Message-
From: Zheng Lin Edwin Yeo  
Sent: September 13, 2018 8:15 PM
To: solr-user@lucene.apache.org
Subject: Re: Faceting with EnumFieldType in 7.1

Was the document re-indexed in Solr 7.1?

Regards,
Edwin

On Wed, 12 Sep 2018 at 23:38, Peter Tyrrell  wrote:

> I updated an older Solr 4.10 core to Solr 7.1 recently. In so doing, I 
> took an old 'gradeLevel_enum' field of type EnumField and made it an 
> EnumFieldType, since the former has been deprecated. The old core was 
> able to facet on gradeLevel_enum, but the new 7.1 core just returns no 
> facet values whatsoever for that field. Both cores return 
> gradeLevel_enum values ok when fl=gradeLevel_enum.
>
> In the schema, gradeLevel_enum is defined dynamically:
>
> <dynamicField ... multiValued="true" />
> <fieldType ... class="solr.EnumFieldType" enumsConfig="enumsConfig.xml" enumName="gradeLevels">
>
> This simple query fails to return any facet values in 7.1, but does 
> facet in 4.10:
>
>
> http://localhost:8983/solr/core1/select?facet.field=gradeLevel_enum&facet=on&fl=id,gradeLevel_enum&q=*:*&wt=json
>
> Thanks for any insight.
>
> Peter Tyrrell, MLIS
> Lead Developer at Andornot
> 1-866-266-2525 x706 / 
> ptyrr...@andornot.com
>
>
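
For reference: faceting reads a field's indexed terms or its docValues, so a
schema along these lines is worth comparing against. This is a sketch only,
with illustrative names rather than the actual schema from the thread:

<fieldType name="gradeLevelsType" class="solr.EnumFieldType"
           enumsConfig="enumsConfig.xml" enumName="gradeLevels" docValues="true"/>
<dynamicField name="*_enum" type="gradeLevelsType"
              indexed="true" stored="true" multiValued="true"/>

Changing a field's type or docValues setting requires a full re-index, which
is why the re-indexing question above matters.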


Re: Explode kind of function in Solr

2018-09-14 Thread Rahul Singh
https://github.com/bazaarvoice/jolt

On Thu, Sep 13, 2018 at 9:18 AM Joel Bernstein  wrote:

> Solr Streaming Expressions allow you to do this with the cartesianProduct
> function:
>
>
> http://lucene.apache.org/solr/guide/7_4/stream-decorator-reference.html#cartesianproduct
>
> The structure of the expression is:
>
> cartesianProduct(search(...))
>
>
>
> Joel Bernstein
> http://joelsolr.blogspot.com/
>
>
> On Thu, Sep 13, 2018 at 6:21 AM Rushikesh Garadade <
> rushikeshgarad...@gmail.com> wrote:
>
> > Hello All,
> > Is there any functionality in Solr that can convert (explode) results
> > from 1 document to many documents?
> > *Example: *
> > Lets say I have doc:
> > {
> > id:1,
> > phone: [11,22,33]
> > }
> >
> > when I query to solr with id=1 I want result as below:
> > [{
> > id:1,
> > phone:11
> > },
> > {
> > id:1,
> > phone:22
> > },
> > {
> > id:1,
> > phone:33
> > }]
> >
> > Please let me know if this is possible in Solr, and if yes, how?
> >
> > Thanks,
> > Rushikesh Garadade
> >
>
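
Putting the two replies together, a sketch of the full expression for the
example above (the collection name and the /export handler are assumptions):

cartesianProduct(
  search(collection1, q="id:1", fl="id,phone", sort="id asc", qt="/export"),
  phone,
  productSort="phone asc"
)

Each emitted tuple carries a single phone value, so the document with id:1
comes back as three tuples, matching the desired output above.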


RE: join query in same collection

2018-09-14 Thread Vadim Ivanov
Hi,
AFAIK, Solr can join only local indexes, no matter whether you join the same
collection or two different ones.
So, in your case, shard1 will be joined to shard1 and shard2 to shard2.
Unfortunately, it's hard to say from your data which document resides in which
shard, but you can test using distrib=false.
-- 
BR, Vadim
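
For example, a sketch of that per-shard check (the core name is illustrative;
run it against each shard's core in turn):

http://localhost:8983/solr/collection1_shard1_replica_n1/select?q={!join from=expctr-label-memberIds to=expctr-id}expctr-id:4b6f7d34-a58b-3399-b077-685951d06738&distrib=false

With distrib=false the query is answered from that core's local index only,
so it shows exactly which documents each local join can see.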



-Original Message-
From: Steve Pruitt [mailto:bpru...@opentext.com] 
Sent: Friday, September 14, 2018 9:22 PM
To: solr-user@lucene.apache.org
Subject: join query in same collection

I see nothing in the documentation suggesting that a query with a join filter
doesn't work when a single collection is involved. There are special
deployment instructions for joining across two distinct collections, but that
is not my case.

I have a single collection:
I have two VMs, both running SolrCloud.
My collection has 2 shards on two different nodes. Max shards per node is set
to 1 and replication factor is set to 1.

The join filter is: {!join from=expctr-label-memberIds 
to=expctr-id}expctr-id:4b6f7d34-a58b-3399-b077-685951d06738

When I run the query, I get back only the document with expctr-id:
2087d22a-6157-306f-8386-8352e7d8e4d4.
It looks like the join may only be finding documents on the replica handling
the query. Shouldn't it search and filter across the entire collection?

The documents:
   [{
"expctr-name":"online account opening",
"expctr-description":["Journey for online customers"],
"expctr-created":1536947719132,
"expctr-to-date":154623240,
"expctr-from-date":153836640,
"expctr-id":"89ec679b-24df-3559-8428-124640c96230",
"expctr-creator":"SP",
"expctr-type":"journey",
"_version_":1611606406752894976},
  {
"expctr-name":"drop-in account opening",
"expctr-description":["Journey for dropin customers"],
"expctr-created":1536947827643,
"expctr-to-date":154623240,
"expctr-from-date":153836640,
"expctr-id":"2087d22a-6157-306f-8386-8352e7d8e4d4",
"expctr-creator":"SP",
"expctr-type":"journey",
"_version_":1611606520475156480},
  {
"expctr-name":"placeholder",
"expctr-label":"customers",
"expctr-created":1536947679984,
"expctr-to-date":0,
"expctr-from-date":0,
"expctr-id":"4b6f7d34-a58b-3399-b077-685951d06738",
"expctr-creator":"SP",
"expctr-type":"label",
"expctr-label-memberIds":["89ec679b-24df-3559-8428-124640c96230", 
"2087d22a-6157-306f-8386-8352e7d8e4d4"],
"_version_":1611606544788488192}]



Re: Sorting multi-valued fields

2018-09-14 Thread Shawn Heisey

On 9/14/2018 4:50 AM, richard.clarke wrote:

What does it mean to sort documents by a multivalued field?  If a field has
multiple values, how can this be used to sort documents?

e.g. if document 1 has a numeric field containing values 1,2,3,4,5 and
document 2 has values in the same field of 1,2,3 - which come first in the
sort order and why?


It's my understanding that Solr will refuse to sort on a multivalued 
field, returning an error.  If Solr were to make a decision to use the 
first value, or the minimum value, or the maximum value ... some users 
would think that was the wrong choice.


You can use a function query to have it sort on the min or max value of 
a field.


https://lucidworks.com/2015/09/10/minmax-on-multivalued-field/
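
For example, a sketch of such a sort, assuming the field has docValues (the
field name here is illustrative):

q=*:*&sort=field(price,min) asc

This orders documents by the smallest value each one holds in the multivalued
price field; field(price,max) desc does the opposite.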

Thanks,
Shawn



Re: solrcloud configuration: solr failed to start with multiple zookeeper servers

2018-09-14 Thread Shawn Heisey

On 9/13/2018 6:47 AM, Gu, Steve (CDC/OD/OADS) (CTR) wrote:

After zk servers have started, I got the following error when I tried to start 
solr:  localhost:2182 was unexpected at this time.


Where is this error seen?  Was there more to the message?  In addition 
to knowing that, we will need to see the FULL text of all errors in the 
solr.log file.


Thanks,
Shawn



RE: solrcloud configuration: solr failed to start with multiple zookeeper servers

2018-09-14 Thread Gu, Steve (CDC/CDC OD/OADS) (CTR)
Edwin,

Solr is 7.4.0, and zookeeper is 3.4.12.

Interestingly, when I start Solr with ZK_HOST="localhost:2181", I can still
access Solr via the other zk hosts with CloudSolrClient.
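
For example, a sketch of such a client (SolrJ 7.x; the hosts and the empty
chroot are assumptions based on the setup described above):

import java.util.Arrays;
import java.util.Optional;
import org.apache.solr.client.solrj.impl.CloudSolrClient;

public class ZkEnsembleCheck {
    public static void main(String[] args) throws Exception {
        // Point the client at the full ensemble, even though Solr itself
        // was started with only localhost:2181 in ZK_HOST.
        try (CloudSolrClient client = new CloudSolrClient.Builder(
                Arrays.asList("localhost:2181", "localhost:2182", "localhost:2183"),
                Optional.empty()) // no ZK chroot
                .build()) {
            client.connect();
            System.out.println(client.getZkStateReader().getClusterState());
        }
    }
}

The client only needs ZooKeeper to fetch cluster state, so it can work through
any live ensemble member regardless of which hosts Solr itself was started
with.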

Thanks
Steve
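
One thing worth checking, although the thread does not confirm it: "was
unexpected at this time" is a cmd.exe syntax error rather than a Solr error,
and quotes around a comma-separated value in solr.in.cmd are a known way to
trigger it when solr.cmd expands the variable. A sketch of the alternative:

REM solr.in.cmd - hypothetical fix: drop the quotes around the list
set ZK_HOST=localhost:2181,localhost:2182,localhost:2183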

-Original Message-
From: Zheng Lin Edwin Yeo  
Sent: Thursday, September 13, 2018 11:10 PM
To: solr-user@lucene.apache.org
Subject: Re: solrcloud configuration: solr failed to start with multiple 
zookeeper servers

Which version of Solr and ZooKeeper are you using?

What is the command that you used to start Solr?

Regards,
Edwin


On Thu, 13 Sep 2018 at 21:56, Gu, Steve (CDC/OD/OADS) (CTR) 
wrote:

> Yes, my zookeeper ensemble has 3 servers and they are all up and running.
>
> -Original Message-
> From: Pure Host - Wolfgang Freudenberger 
> 
> Sent: Thursday, September 13, 2018 8:57 AM
> To: solr-user@lucene.apache.org
> Subject: Re: solrcloud configuration: solr failed to start with 
> multiple zookeeper servers
>
> did you configure the zookeeper as a quorum?
>
> Am 13.09.2018 um 14:47 schrieb Gu, Steve (CDC/OD/OADS) (CTR):
> > Hi,
> >
> > I am prototyping solrcloud and I have three zookeeper servers
> (localhost:2181,localhost:2182,localhost:2183).  I set the zkHost in 
> solr.in.cmd file as:
> >
> > set ZK_HOST="localhost:2181,localhost:2182,localhost:2183"
> >
> > After zk servers have started, I got the following error when I 
> > tried to
> start solr:  localhost:2182 was unexpected at this time.
> >
> > Solr starts ok if I set ZK_Host to only one server, such as set
> ZK_HOST="localhost:2181".
> >
> > Any suggestions?
> >
> > Thanks
> > Steve
> >
>
>
>


Re: Solr uppercase inside phrase query

2018-09-14 Thread arobinski
Did you manage to solve the problem? I have the same problem and would like
to know a solution.


Chien Nguyen wrote
> Many thanks. I will try it.





--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Re: Sorting multi-valued fields

2018-09-14 Thread Mikhail Khludnev
http://people.apache.org/~mkhl/searchable-solr-guide-7-3/common-query-parameters.html#sort-parameter
In the case of primitive fields, or SortableTextFields, that are
multiValued="true", the representative value used for each doc when sorting
depends on the sort direction: the minimum value in each document is used
for ascending (asc) sorting, while the maximal value in each document is
used for descending (desc) sorting. This default behavior is equivalent to
explicitly sorting using the 2-argument field() function:
sort=field(name,min) asc and sort=field(name,max) desc

On Fri, Sep 14, 2018 at 1:50 PM richard.clarke 
wrote:

> Hi
> What does it mean to sort documents by a multivalued field?  If a field has
> multiple values, how can this be used to sort documents?
>
> e.g. if document 1 has a numeric field containing values 1,2,3,4,5 and
> document 2 has values in the same field of 1,2,3 - which come first in the
> sort order and why?
>
> Thanks in advance.
>
>
>
> --
> Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html
>


-- 
Sincerely yours
Mikhail Khludnev


RE: Idle Timeout while DIH indexing and implicit sharding in 7.4

2018-09-14 Thread Vadim Ivanov
Hello
Mikhail, thank you for the support.
I have already tested this case a lot to be sure what is happening under the
hood.
As you proposed, I've shuffled the data coming from SQL to Solr to see how
Solr reacts:
I have 6 shards s0 ... s5.
shard is the routing field in my collection
(router.name=implicit&router.field=shard).
My SQL query looks like this:

Select
    id
    , Case when 100 > RowNumber then 's5'
           else 's_' + cast(RowNumber % 4 as varchar)
      end as shard
from ...

Only the first 99 rows go to shard s5, and all the rest of the data spreads
evenly across s0 ... s3.
After 120 sec of indexing I receive an idle timeout from the shard leader of
s5.
s4 receives no data and seems not to open a connection at all, so no timeout
occurs.
s0...s3 receive data and no timeout occurs.

When I tweak the idle timeout in /opt/solr-7.4.0/server/etc/jetty-http.xml it
helps, but I have concerns about increasing it from 120 sec to 30 min.
Is it safe? What consequences could there be?

I have noticed that the idle timeout in Jetty 9.3.8 (shipped with Solr 6.3.0)
was 50 sec, and no such behavior was observed in Solr 6.3. So the default was
increased significantly in 9.4.10 for some reason.
Maybe someone could shed some light on the reasons. What changed in document
routing behavior, and why?
Maybe there was a discussion about it that I could not find?

-- 
BR Vadim
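
For reference, the setting in question, assuming the stock Solr 7.4
jetty-http.xml, which reads a system property so the value can also be raised
at startup instead of by editing the file:

<!-- server/etc/jetty-http.xml -->
<Set name="idleTimeout">
  <Property name="solr.jetty.http.idleTimeout" default="120000"/>
</Set>

# solr.in.sh - hypothetical override to 30 minutes
SOLR_OPTS="$SOLR_OPTS -Dsolr.jetty.http.idleTimeout=1800000"

The main trade-off of a long idle timeout is that genuinely dead connections
are reaped more slowly, so they hold their sockets and buffers longer; it does
not slow down healthy requests.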

-Original Message-
From: Mikhail Khludnev [mailto:m...@apache.org] 
Sent: Friday, September 14, 2018 12:10 PM
To: solr-user
Subject: Re: Idle Timeout while DIH indexing and implicit sharding in 7.4

Hello, Vadim.
My guess (and only a guess) is that the bunch of updates coming into a shard
causes a heavy merge that blocks new updates in its turn. This can be
verified with logs or a thread dump from the problematic node. The probable
measures are: try to shuffle updates to load other shards for a while and let
the parallel merge pack that shard, or just wait a little by increasing the
timeout in jetty.
Let us know what you encounter.

On Thu, Sep 13, 2018 at 3:54 PM Vadim Ivanov <
vadim.iva...@spb.ntk-intourist.ru> wrote:

> Hi,
> I've put some more tests on the issue and managed to find out more details.
> The timeout occurs when, during a long indexing run, some documents go to
> one shard at the beginning and then, for a long time (more than 120 sec),
> no data at all goes to that shard.
> The connection to that core, opened at the beginning of indexing, hits the
> idle timeout :(.
> If no data at all goes to the shard during indexing, no timeout occurs on
> that shard.
> If indexing finishes earlier than 120 sec, no timeout occurs on that shard.
> Unfortunately, in our use case there are lots of long indexing runs, up to
> 30 minutes, with uneven shard distribution of documents.
> Any suggestion on how to mitigate the issue?
> --
> BR
> Vadim Ivanov
>
>
> -Original Message-
> From: Вадим Иванов [mailto:vadim.iva...@spb.ntk-intourist.ru]
> Sent: Wednesday, September 12, 2018 4:29 PM
> To: solr-user@lucene.apache.org
> Subject: Idle Timeout while DIH indexing and implicit sharding in 7.4
>
> Hello gurus,
> I am using SolrCloud with DIH for indexing my data.
> Testing 7.4.0 with an implicitly sharded collection, I have noticed that
> any indexing run longer than 2 minutes always fails, with many timeout
> records in the log coming from all replicas in the collection.
>
> Such as:
> x:Mycol_s_0_replica_t40 RequestHandlerBase
> java.io.IOException: java.util.concurrent.TimeoutException: Idle timeout
> expired: 120001/120000 ms
> null:java.io.IOException: java.util.concurrent.TimeoutException: Idle
> timeout expired: 120000/120000 ms
>   at org.eclipse.jetty.server.HttpInput$ErrorState.noContent(HttpInput.java:1075)
>   at org.eclipse.jetty.server.HttpInput.read(HttpInput.java:313)
>   at org.apache.solr.servlet.ServletInputStreamWrapper.read(ServletInputStreamWrapper.java:74)
> ...
> Caused by: java.util.concurrent.TimeoutException: Idle timeout expired:
> 120000/120000 ms
>   at org.eclipse.jetty.io.IdleTimeout.checkIdleTimeout(IdleTimeout.java:166)
>   at org.eclipse.jetty.io.IdleTimeout$1.run(IdleTimeout.java:50)
>   at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
>   at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   ... 1 more
>   Suppressed: java.lang.Throwable: HttpInput failure
>     at org.eclipse.jetty.server.HttpInput.failed(HttpInput.java:821)
>     at
>
> 

Re: 20180913 - Clarification about Limitation

2018-09-14 Thread Emir Arnautović
Hi,
Here are some thoughts on how to resolve some of the “it depends”: 
http://www.od-bits.com/2018/01/solrelasticsearch-capacity-planning.html 


HTH,
Emir
--
Monitoring - Log Management - Alerting - Anomaly Detection
Solr & Elasticsearch Consulting Support Training - http://sematext.com/



> On 13 Sep 2018, at 14:59, Shawn Heisey  wrote:
> 
> On 9/13/2018 2:07 AM, Rekha wrote:
>> Hi Solr Team,
>> I am new to Solr. I need the following clarifications from you:
>> How many documents can be stored in one core?
>> Is there any limit on the number of fields per document?
>> How many cores can be created in one Solr?
>> Is there any other limitation based on disk storage size? I mean, some
>> databases have a 10 GB limit, so I am asking about something like that.
>> Can we use Solr as a database?
> 
> You *can* use Solr as a database, but I wouldn't.  It's not designed for that 
> role.  Actual database software is better for that.  If all you need is 
> simple data storage, Solr can handle that, but as soon as you start talking 
> about complex operations like JOIN, a real database is FAR better.  Solr is a 
> search engine, and in my opinion, that's what it should be used for.
> 
> The only HARD limit that Solr has is actually a Lucene limit.  Lucene uses 
> the java "int" type for its internal document ID.  Which means that the 
> absolute maximum number of documents in one Solr core is 2147483647.  That's 
> a little over two billion.  You're likely to have scalability problems long 
> before you reach this number, though.  Also, this number includes deleted 
> documents, so it's not a good idea to actually get close to the limit.  One 
> rough rule of thumb that sometimes gets used:  If you have more than one 
> hundred million documents in a single core, you PROBABLY need to think about 
> re-designing your setup.
> 
> Using a sharded index (which SolrCloud can do a lot easier than standalone 
> Solr) removes the two billion document limitation for an index -- by 
> spreading the index across multiple Solr cores.
> 
> As for storage, you should have enough disk space available so that your 
> index data can triple in size temporarily.  This is not a joke -- that's 
> really the recommendation.  The way that Lucene operates requires that you 
> have at least *double* capacity, but there are real world situations in which 
> the index can triple in size.
> 
> Running with really big indexes means that you also need a lot of memory.  
> Good performance with Solr requires that the operating system has enough 
> memory to effectively cache the often-used parts of the index.
> 
> https://wiki.apache.org/solr/SolrPerformanceProblems#RAM
> 
> Thanks,
> Shawn
> 
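
For illustration of the sharding point above, a sketch of creating such a
collection (the name and counts are hypothetical):

/admin/collections?action=CREATE&name=bigindex&numShards=8&replicationFactor=2

Each of the 8 shards is its own Lucene index with its own two-billion-document
ceiling, so the collection as a whole can go far beyond it.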



Re: Data Import Handler with Solr Source behind Load Balancer

2018-09-14 Thread Emir Arnautović
Hi Thomas,
Is this SolrCloud or Solr master-slave? Do you update the index while the
import is running? If you are using master-slave, did you check that all your
instances behind the LB are in sync?
My guess would be that DIH is using cursors to read data from the other Solr.
If you are using multiple Solr instances behind the LB, there might be some
diffs in the indexes that result in different documents being returned for
the same cursor mark. Are numDocs and maxDoc the same on the new instance
after the import?
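
For example, the Luke handler reports both numbers per core (the hosts and
core name are illustrative):

http://host1:8983/solr/core1/admin/luke?numTerms=0&wt=json
http://host2:8983/solr/core1/admin/luke?numTerms=0&wt=json

If numDocs or maxDoc differ between the instances behind the LB, the cursor
marks will not line up.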

HTH,
Emir
--
Monitoring - Log Management - Alerting - Anomaly Detection
Solr & Elasticsearch Consulting Support Training - http://sematext.com/



> On 12 Sep 2018, at 05:53, Zimmermann, Thomas  
> wrote:
> 
> We have a Solr v7 instance sourcing data from a Data Import Handler with a
> Solr data source running Solr v4. When it hits a single server in that
> instance directly, all documents are read and written correctly to the v7.
> When we hit the load balancer DNS entry, the resulting data import handler
> JSON states that it read all the documents and skipped none, and all looks
> fine, but the result set is missing ~20% of the documents in the v7 core.
> This has happened multiple times in multiple environments.
> 
> Any thoughts on whether this might be a bug in the underlying DIH code? I'll 
> also pass it along to the server admins on our side for input.



Re: Idle Timeout while DIH indexing and implicit sharding in 7.4

2018-09-14 Thread Mikhail Khludnev
Hello, Vadim.
My guess (and only a guess) is that the bunch of updates coming into a shard
causes a heavy merge that blocks new updates in its turn. This can be
verified with logs or a thread dump from the problematic node. The probable
measures are: try to shuffle updates to load other shards for a while and let
the parallel merge pack that shard, or just wait a little by increasing the
timeout in jetty.
Let us know what you encounter.

On Thu, Sep 13, 2018 at 3:54 PM Vadim Ivanov <
vadim.iva...@spb.ntk-intourist.ru> wrote:

> Hi,
> I've put some more tests on the issue and managed to find out more details.
> The timeout occurs when, during a long indexing run, some documents go to
> one shard at the beginning and then, for a long time (more than 120 sec),
> no data at all goes to that shard.
> The connection to that core, opened at the beginning of indexing, hits the
> idle timeout :(.
> If no data at all goes to the shard during indexing, no timeout occurs on
> that shard.
> If indexing finishes earlier than 120 sec, no timeout occurs on that shard.
> Unfortunately, in our use case there are lots of long indexing runs, up to
> 30 minutes, with uneven shard distribution of documents.
> Any suggestion on how to mitigate the issue?
> --
> BR
> Vadim Ivanov
>
>
> -Original Message-
> From: Вадим Иванов [mailto:vadim.iva...@spb.ntk-intourist.ru]
> Sent: Wednesday, September 12, 2018 4:29 PM
> To: solr-user@lucene.apache.org
> Subject: Idle Timeout while DIH indexing and implicit sharding in 7.4
>
> Hello gurus,
> I am using SolrCloud with DIH for indexing my data.
> Testing 7.4.0 with an implicitly sharded collection, I have noticed that
> any indexing run longer than 2 minutes always fails, with many timeout
> records in the log coming from all replicas in the collection.
>
> Such as:
> x:Mycol_s_0_replica_t40 RequestHandlerBase
> java.io.IOException: java.util.concurrent.TimeoutException: Idle timeout
> expired: 120001/120000 ms
> null:java.io.IOException: java.util.concurrent.TimeoutException: Idle
> timeout expired: 120000/120000 ms
>   at org.eclipse.jetty.server.HttpInput$ErrorState.noContent(HttpInput.java:1075)
>   at org.eclipse.jetty.server.HttpInput.read(HttpInput.java:313)
>   at org.apache.solr.servlet.ServletInputStreamWrapper.read(ServletInputStreamWrapper.java:74)
> ...
> Caused by: java.util.concurrent.TimeoutException: Idle timeout expired:
> 120000/120000 ms
>   at org.eclipse.jetty.io.IdleTimeout.checkIdleTimeout(IdleTimeout.java:166)
>   at org.eclipse.jetty.io.IdleTimeout$1.run(IdleTimeout.java:50)
>   at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
>   at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   ... 1 more
>   Suppressed: java.lang.Throwable: HttpInput failure
>     at org.eclipse.jetty.server.HttpInput.failed(HttpInput.java:821)
>     at org.eclipse.jetty.server.HttpConnection$BlockingReadCallback.failed(HttpConnection.java:649)
>     at org.eclipse.jetty.io.FillInterest.onFail(FillInterest.java:134)
>
> Resulting indexing status:
>   "statusMessages":{
> "Total Requests made to DataSource":"1",
> "Total Rows Fetched":"2828323",
> "Total Documents Processed":"2828323",
> "Total Documents Skipped":"0",
> "Full Dump Started":"2018-09-12 14:28:21",
> "":"Indexing completed. Added/Updated: 2828323 documents. Deleted 0
> documents.",
> "Committed":"2018-09-12 14:33:41",
> "Time taken":"0:5:19.507",
> "Full Import failed":"2018-09-12 14:33:41"}}
>
> Nevertheless, all these documents seem to be indexed fine and searchable.
> If the same collection is not sharded, or is sharded as "compositeId",
> indexing completes without any errors.
> The type of replicas (nrt or tlog) doesn't matter.
> Small indexing runs (taking less than 2 minutes) run smoothly.
>
> Testing environment - 1 node, Collection with 6 shards, 1 replica for each
> shard
> Collection:
> /admin/collections?action=CREATE&name=Mycol
> &numShards=6
> &router.name=implicit
> &shards=s_0,s_1,s_2,s_3,s_4,s_5
> &router.field=sf_shard
> &collection.configName=Mycol
> &maxShardsPerNode=10
> &nrtReplicas=0&tlogReplicas=1
>
>
> I have never noticed such behavior before on my prod configuration (Solr
> 6.3.0).
> Seems like a bug in the new version, but I could not find any Jira issue
> about it.
>
> Any ideas, please...
>
> --
> BR
> Vadim Ivanov
>
>

-- 
Sincerely yours
Mikhail Khludnev