Re: ant precommit fails on .adoc files

2019-10-30 Thread Cassandra Targett
On Oct 29, 2019, 10:44 AM -0500, Shawn Heisey wrote:
>
> I tried once to build a Solr package on Windows. It didn't work,
> requiring tools that are not normally found on Windows. What I found
> for this thread seems to indicate that the source validation for the ref
> guide does not work correctly either. I would be interested in finding
> out whether or not we expect the build system to work right on Windows.
> I suspect that it is not supported.
>

Just to be clear, the error from the original poster was thrown from the 
‘validate-source-patterns’ task, which is a dependency of the ‘validate’ task 
of precommit and doesn’t use the Ref Guide tooling. It simply disliked something 
it found in a file that happens to be used for the Ref Guide.

Ref Guide validation, where the Ref Guide tooling is used, happens in the 
‘documentation’ task, so the culprit here is more likely the Rat tooling that’s 
used to validate all source files.
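
If it helps to narrow things down, the two validations can be run separately 
from the top of the checkout (a sketch; I haven’t re-checked that both targets 
are directly invokable on every branch):

    ant validate-source-patterns   # the precommit dependency that produced the error here
    ant documentation              # where the Ref Guide tooling actually runs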

Cassandra


Re: NRT vs TLOG bulk indexing performances

2019-10-30 Thread Dominique Bejean
Hi,

Thank you Erick for your response.

My documents are small. Here is a sample csv file
http://gofile.me/2dlpH/66hv2NPhJ

In the TLOG case, the CPU is neither hot nor idling.

On leaders:

   - 1m load average between 1.5 and 2.5 (4 cpu cores)
   - CPU % between 20% and 50% with average at 30%
   - CPU I/O wait % average : 2.5


On followers:

   - 1m load average between 0.5 and 2.0 (4 cpu cores)
   - CPU % between 5% and 35% with average at 15%
   - CPU I/O wait % average : 2.0


I ran more tests. The difference is not always as big as in my first tests:

   - One shard leader only NRT or TLOG : 36 minutes
   - All NRT timing is between 23 and 27 minutes
   - All TLOG timing is between 28 and 34 minutes


I also changed the autoCommit maxTime from 15000 to 3 in order to get
the 28 minutes in TLOG mode.

With one shard and no replicas, creating the collection as NRT or as TLOG
gives the same indexing time and the same CPU usage.
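
For anyone reproducing this, the two variants are created with the Collections
API replica-type parameters, something like the following (host, collection
names and counts are placeholders, not my exact commands):

    http://localhost:8983/solr/admin/collections?action=CREATE&name=test_nrt&numShards=2&nrtReplicas=2
    http://localhost:8983/solr/admin/collections?action=CREATE&name=test_tlog&numShards=2&tlogReplicas=2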

My impression is that using TLOG replicas increases indexing time by 10% to
20%, depending on the autoCommit maxTime setting.
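
For reference, the setting I am varying is the autoCommit block in
solrconfig.xml, roughly like this (openSearcher shown only as an assumption,
not copied from my actual config):

    <autoCommit>
      <maxTime>15000</maxTime>
      <openSearcher>false</openSearcher>
    </autoCommit>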

Regards

Dominique


On Fri, Oct 25, 2019 at 15:46, Erick Erickson wrote:

> I’m also surprised that you see a slowdown; it’s worth investigating.
>
> Let’s take the NRT case with only a leader. I’ve seen the NRT indexing
> time increase when even a single follower was added (30-40% in this case).
> We believed that the issue was the time the leader sat waiting around for
> the follower to acknowledge receipt of the documents. Also note that these
> were very short documents.
>
> You’d still pay that price with more than one TLOG replica. But again, I’d
> expect the two times to be roughly equivalent.
>
> Indexing does not stop during index replication. That said, if you commit
> very frequently, you’ll be pushing lots of info around the network. Was
> your CPU running hot in the TLOG case or idling? If idling, then Solr isn’t
> getting fed fast enough. Perhaps there’s increased network traffic with the
> TLOG replicas replicating changed segments and that’s slowing down
> ingestion?
>
> It’d be interesting to index to a leader-only NRT collection and also to a
> single-TLOG collection.
>
>
> Best,
> Erick
>
> > On Oct 25, 2019, at 8:28 AM, Dominique Bejean wrote:
> >
> > Shawn,
> >
> > So, I understand that while a non-leader TLOG replica is copying the index
> > from the leader, the leader stops indexing.
> > One-shot large, heavy bulk indexing should be much more impacted than
> > continuous light indexing.
> >
> > Regards.
> >
> > Dominique
> >
> >
> > On Fri, Oct 25, 2019 at 13:54, Shawn Heisey wrote:
> >
> >> On 10/25/2019 1:16 AM, Dominique Bejean wrote:
> >>> For collection created with all replicas as NRT
> >>>
> >>> * Indexing time : 22 minutes
> >>
> >> 
> >>
> >>> For collection created with all replicas as TLOG
> >>>
> >>> * Indexing time : 34 minutes
> >>
> >> NRT indexes simultaneously on all replicas.  So when indexing is done on
> >> one, it is also done on all the others.
> >>
> >> PULL and non-leader TLOG replicas must copy the index from the leader.
> >> The leader will do the indexing and the other replicas will copy the
> >> completed index from the leader.  This takes time.  If the index is
> >> large, it can take a LOT of time, especially if the disks or network are
> >> slow.  TLOG replicas can become leader and PULL replicas cannot.
> >>
> >> What I would do personally is set two replicas for each shard to TLOG
> >> and all the rest to PULL.  When a TLOG replica is acting as leader, it
> >> will function exactly like an NRT replica.
> >>
> >>> The conclusion seems to be that by using TLOG:
> >>>
> >>> * You save CPU resources on non-leader nodes at index time
> >>> * The JVM heap and GC are the same
> >>> * Indexing performance is really lower with TLOG
> >>
> >> Java works in such a way that it will always eventually allocate and use
> >> the entire max heap that it is allowed.  It is not always possible to
> >> determine how much heap is truly needed, though analyzing large GC logs
> >> will sometimes reveal that info.
> >>
> >> Non-leader replicas will probably require less heap if they are TLOG or
> >> PULL.  I cannot say how much less, that will be something that has to be
> >> determined.  Those replicas will also use less CPU.
> >>
> >> With newer Solr versions, you can ask SolrCloud to prefer PULL replicas
> >> for querying, so queries will be targeted to those replicas, unless they
> >> all go down, in which case it will go to non-preferred replica types.  I
> >> do not know how to do this, I only know that it is possible.
> >>
> >> Thanks,
> >> Shawn
> >>
>
>
>


Re: colStatus response not as expected with Solr 8.1.1 in a distributed deployment

2019-10-30 Thread Erick Erickson
https://issues.apache.org/jira/browse/SOLR-13882

Do watch out for browser or other caching; I often use a private window to 
avoid being fooled, since I’ve had that happen more than once. If you see this 
problem and then look in the UI at 
cloud>>tree>>collections>>your_collection>>state.json and see the state of a 
replica as “down”, then it’s most probably some kind of outside-of-Solr 
caching, because that value is just what gets counted to create the output for 
COLSTATUS.
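
For reference, a replica entry in state.json looks roughly like this (a sketch;
the core, node and host names are placeholders), and that "state" value is the
one being counted:

    "core_node3":{
      "core":"your_collection_shard1_replica_n1",
      "base_url":"http://host2:8983/solr",
      "node_name":"host2:8983_solr",
      "state":"down",
      "type":"NRT"}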

Also be aware that the corresponding entry in live_nodes will _NOT_ be removed 
until ZK tries to ping the Solr node and times out. So there’s a lag between 
when a node goes away ungracefully and when that node is removed, during which 
the replica will be counted as active even if live_nodes is checked.
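
If you want to check that condition programmatically, here is a minimal SolrJ
sketch (assumptions on my part: ZK at localhost:2181 and a collection named
"coll"; method names as I remember them from the 8.x SolrJ API), showing the
state-plus-live_nodes test:

import java.util.Collections;
import java.util.Optional;
import java.util.Set;

import org.apache.solr.client.solrj.impl.CloudSolrClient;
import org.apache.solr.common.cloud.ClusterState;
import org.apache.solr.common.cloud.Replica;

public class TrulyActiveCheck {
  public static void main(String[] args) throws Exception {
    // Connect through ZooKeeper so we can read the cluster state directly.
    try (CloudSolrClient client = new CloudSolrClient.Builder(
        Collections.singletonList("localhost:2181"), Optional.empty()).build()) {
      client.connect();
      ClusterState clusterState = client.getZkStateReader().getClusterState();
      Set<String> liveNodes = clusterState.getLiveNodes();
      for (Replica replica : clusterState.getCollection("coll").getReplicas()) {
        // isActive() applies both conditions: state == ACTIVE in state.json
        // AND the hosting node is still present in live_nodes.
        boolean trulyActive = replica.isActive(liveNodes);
        System.out.println(replica.getName() + " on " + replica.getNodeName()
            + " -> " + (trulyActive ? "active" : "not active"));
      }
    }
  }
}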

As far as the UI is concerned, please search the JIRA system first to see if 
it’s already been noted; otherwise go ahead and raise a JIRA. All you need is 
a sign-on. Do include which browser and version, which Solr version, and a 
screenshot, please.

Best,
Erick


> On Oct 30, 2019, at 10:08 AM, Elizaveta Golova  wrote:
> 
> We tried both stopping Solr gracefully and killing the Docker container 
> (not gracefully), and always had the same results.
> 
> 
> That's brilliant, thank you.
> Could you please send a link to the issue once it's up?
> We have our clusterStatus and colStatus JSON responses and our collection 
> graph showing one of the nodes being down, if you'd like us to attach those 
> to the issue.
> 
> 
> Also, whenever we've come across this down-node problem, we've noticed a bit 
> of a UI issue on the cloud/nodes view where one of the node rows has its 
> column output off by one (we can attach the screenshot to the issue as well 
> if you'd like), i.e. the "Node" value would be in the "Host" column, the 
> "CPU" value would be in the "Node" column, and so on, making the "Replicas" 
> column empty.
> 
> 
> 
> -Erick Erickson  wrote: -
> To: solr-user@lucene.apache.org
> From: Erick Erickson 
> Date: 10/30/2019 01:37PM
> Subject: Re: [EXTERNAL] colStatus response not as expected with Solr 8.1.1 in 
> a distributed deployment
> 
> 
> Exactly how did you kill the instance? If I stop Solr gracefully (bin/solr 
> stop…) it’s fine. If I do a “kill -9” on it, I see the same thing you do on 
> master.
> 
> It’s a bit tricky. When a node goes away without a chance to gracefully shut 
> down, there’s no chance to set the state in the collection’s “state.json” 
> znode. However, the node will be removed from the “live_nodes” list and a 
> replica is not truly active unless its state is “active” in the state.json 
> file _and_ the node appears in live_nodes.
> 
> CLUSTERSTATUS pretty clearly understands this, but COLSTATUS apparently 
> doesn’t.
> 
> I’ll raise a JIRA.
> 
> Thanks for letting us know
> 
> Erick
> 
>> On Oct 29, 2019, at 2:10 PM, Elizaveta Golova  wrote:
>> 
>> colStatus (and clusterStatus) from the Collections api.
>> https://lucene.apache.org/solr/guide/8_1/collections-api.html#colstatus
>> 
>> 
>> Running something like this in the browser where the live solr node is 
>> accessible on port 8983 (but points at a Docker container which is running 
>> the Solr node):
>> http://localhost:8983/solr/admin/collections?action=COLSTATUS&collection=coll
>> 
>> 
>> 
>> 
>> -Erick Erickson  wrote: -
>> To: solr-user@lucene.apache.org
>> From: Erick Erickson 
>> Date: 10/29/2019 05:39PM
>> Subject: [EXTERNAL] Re: colStatus response not as expected with Solr 8.1.1 
>> in a distributed deployment
>> 
>> 
>> Uhm, what is colStatus? You need to show us _exactly_ what Solr commands 
>> you’re running for us to make any intelligent comments.
>> 
>>> On Oct 29, 2019, at 1:12 PM, Elizaveta Golova  wrote:
>>> 
>>> Hi,
>>> 
>>> We're seeing an issue with colStatus in a distributed Solr deployment.
>>> 
>>> Scenario:
>>> Collection with:
>>> - 1 zk
>>> - 2 solr nodes on different boxes (simulated using Docker containers)
>>> - replication factor 5
>>> 
>>> When we take down one node, our clusterStatus response is as expected (only 
>>> listing the live node as live, and anything on the "down" node shows the 
>>> state as down).
>>> 
>>> Our colStatus response, however, continues to show every shard as being 
>>> "active", with the replica breakdown on every shard as "total" == "active" 
>>> and "down" always being zero.
>>> i.e.
>>> "shards":{
>>> "shard1":{
>>> "state":"active",
>>> "range":"8000-",
>>> "replicas":{
>>> "total":5,
>>> "active":5,
>>> "down":0

Re: Reg: Solr Segment level cache

2019-10-30 Thread Mikhail Khludnev
Hello,
Which particular cache are you talking about?
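
For reference, these are the caches normally configured in solrconfig.xml, in
case you mean one of them (a sketch with placeholder sizes, not a
recommendation):

    <query>
      <filterCache      class="solr.FastLRUCache" size="512" initialSize="512" autowarmCount="0"/>
      <queryResultCache class="solr.LRUCache"     size="512" initialSize="512" autowarmCount="0"/>
      <documentCache    class="solr.LRUCache"     size="512" initialSize="512" autowarmCount="0"/>
    </query>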

On Wed, Oct 30, 2019 at 12:19 AM lawrence antony  wrote:

> Dear Sir
>
> Does Solr support a segment-level cache, so that if only a single segment
> changes, only a small portion of the cached data needs to be refreshed?
>
> --
> with thanks and regards,
> lawrence antony.
>


-- 
Sincerely yours
Mikhail Khludnev


Re: colStatus response not as expected with Solr 8.1.1 in a distributed deployment

2019-10-30 Thread Elizaveta Golova
We tried both stopping Solr gracefully and killing the Docker container 
(not gracefully), and always had the same results.


That's brilliant, thank you.
Could you please send a link to the issue once it's up?
We have our clusterStatus and colStatus JSON responses and our collection graph 
showing one of the nodes being down, if you'd like us to attach those to the 
issue.


Also, whenever we've come across this down-node problem, we've noticed a bit 
of a UI issue on the cloud/nodes view where one of the node rows has its 
column output off by one (we can attach the screenshot to the issue as well if 
you'd like), i.e. the "Node" value would be in the "Host" column, the "CPU" 
value would be in the "Node" column, and so on, making the "Replicas" column 
empty.



-Erick Erickson  wrote: -
To: solr-user@lucene.apache.org
From: Erick Erickson 
Date: 10/30/2019 01:37PM
Subject: Re: [EXTERNAL] colStatus response not as expected with Solr 8.1.1 in a 
distributed deployment


Exactly how did you kill the instance? If I stop Solr gracefully (bin/solr 
stop…) it’s fine. If I do a “kill -9” on it, I see the same thing you do on 
master.

It’s a bit tricky. When a node goes away without a chance to gracefully shut 
down, there’s no chance to set the state in the collection’s “state.json” 
znode. However, the node will be removed from the “live_nodes” list and a 
replica is not truly active unless its state is “active” in the state.json file 
_and_ the node appears in live_nodes.

CLUSTERSTATUS pretty clearly understands this, but COLSTATUS apparently doesn’t.

I’ll raise a JIRA.

Thanks for letting us know

Erick

> On Oct 29, 2019, at 2:10 PM, Elizaveta Golova  wrote:
>
> colStatus (and clusterStatus) from the Collections api.
> https://lucene.apache.org/solr/guide/8_1/collections-api.html#colstatus
>
>
> Running something like this in the browser where the live solr node is 
> accessible on port 8983 (but points at a Docker container which is running 
> the Solr node):
> http://localhost:8983/solr/admin/collections?action=COLSTATUS&collection=coll
>
>
>
>
> -Erick Erickson  wrote: -
> To: solr-user@lucene.apache.org
> From: Erick Erickson 
> Date: 10/29/2019 05:39PM
> Subject: [EXTERNAL] Re: colStatus response not as expected with Solr 8.1.1 in 
> a distributed deployment
>
>
> Uhm, what is colStatus? You need to show us _exactly_ what Solr commands 
> you’re running for us to make any intelligent comments.
>
>> On Oct 29, 2019, at 1:12 PM, Elizaveta Golova  wrote:
>>
>> Hi,
>>
>> We're seeing an issue with colStatus in a distributed Solr deployment.
>>
>> Scenario:
>> Collection with:
>> - 1 zk
>> - 2 solr nodes on different boxes (simulated using Docker containers)
>> - replication factor 5
>>
>> When we take down one node, our clusterStatus response is as expected (only 
>> listing the live node as live, and anything on the "down" node shows the 
>> state as down).
>>
>> Our colStatus response, however, continues to show every shard as being 
>> "active", with the replica breakdown on every shard as "total" == "active" 
>> and "down" always being zero.
>> i.e.
>> "shards":{
>> "shard1":{
>> "state":"active",
>> "range":"8000-",
>> "replicas":{
>> "total":5,
>> "active":5,
>> "down":0,
>> "recovering":0,
>> "recovery_failed":0},
>>
>> Even though we expect the "down" count to be either 3 or 2 depending on the 
>> shard (and thus the "active" count to be 2 or 3 lower than reported).
>>
>> When testing this situation with both Solr nodes on the same box, the 
>> colStatus response is as expected with regard to the replica counts.
>>
>> Thanks!
>>
>> Unless stated otherwise above:
>> IBM United Kingdom Limited - Registered in England and Wales with number 
>> 741598.
>> Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU
>>
>
> Unless stated otherwise above:
> IBM United Kingdom Limited - Registered in England and Wales with number 
> 741598.
> Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU
>


Unless stated otherwise above:
IBM United Kingdom Limited - Registered in England and Wales with number 
741598. 
Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU



Re: [EXTERNAL] colStatus response not as expected with Solr 8.1.1 in a distributed deployment

2019-10-30 Thread Erick Erickson
Exactly how did you kill the instance? If I stop Solr gracefully (bin/solr 
stop…) it’s fine. If I do a “kill -9” on it, I see the same thing you do on 
master.

It’s a bit tricky. When a node goes away without a chance to gracefully shut 
down, there’s no chance to set the state in the collection’s “state.json” 
znode. However, the node will be removed from the “live_nodes” list and a 
replica is not truly active unless its state is “active” in the state.json file 
_and_ the node appears in live_nodes.

CLUSTERSTATUS pretty clearly understands this, but COLSTATUS apparently doesn’t.
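
For comparison, hitting both actions against the same collection shows the
disagreement directly (same style of URL as yours below; host and collection
name are placeholders):

    http://localhost:8983/solr/admin/collections?action=CLUSTERSTATUS&collection=coll
    http://localhost:8983/solr/admin/collections?action=COLSTATUS&collection=coll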

I’ll raise a JIRA.

Thanks for letting us know

Erick

> On Oct 29, 2019, at 2:10 PM, Elizaveta Golova  wrote:
> 
> colStatus (and clusterStatus) from the Collections api.
> https://lucene.apache.org/solr/guide/8_1/collections-api.html#colstatus
> 
> 
> Running something like this in the browser where the live solr node is 
> accessible on port 8983 (but points at a Docker container which is running 
> the Solr node):
> http://localhost:8983/solr/admin/collections?action=COLSTATUS&collection=coll
> 
> 
> 
> 
> -Erick Erickson  wrote: -
> To: solr-user@lucene.apache.org
> From: Erick Erickson 
> Date: 10/29/2019 05:39PM
> Subject: [EXTERNAL] Re: colStatus response not as expected with Solr 8.1.1 in 
> a distributed deployment
> 
> 
> Uhm, what is colStatus? You need to show us _exactly_ what Solr commands 
> you’re running for us to make any intelligent comments.
> 
>> On Oct 29, 2019, at 1:12 PM, Elizaveta Golova  wrote:
>> 
>> Hi,
>> 
>> We're seeing an issue with colStatus in a distributed Solr deployment.
>> 
>> Scenario:
>> Collection with:
>> - 1 zk
>> - 2 solr nodes on different boxes (simulated using Docker containers)
>> - replication factor 5
>> 
>> When we take down one node, our clusterStatus response is as expected (only 
>> listing the live node as live, and anything on the "down" node shows the 
>> state as down).
>> 
>> Our colStatus response, however, continues to show every shard as being 
>> "active", with the replica breakdown on every shard as "total" == "active" 
>> and "down" always being zero.
>> i.e.
>> "shards":{
>> "shard1":{
>> "state":"active",
>> "range":"8000-",
>> "replicas":{
>> "total":5,
>> "active":5,
>> "down":0,
>> "recovering":0,
>> "recovery_failed":0},
>> 
>> Even though we expect the "down" count to be either 3 or 2 depending on the 
>> shard (and thus the "active" count to be 2 or 3 lower than reported).
>>
>> When testing this situation with both Solr nodes on the same box, the 
>> colStatus response is as expected with regard to the replica counts.
>> 
>> Thanks!
>>
>> Unless stated otherwise above:
>> IBM United Kingdom Limited - Registered in England and Wales with number 
>> 741598.
>> Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU
>> 
> 
> Unless stated otherwise above:
> IBM United Kingdom Limited - Registered in England and Wales with number 
> 741598. 
> Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU
> 



Reg: Solr Segment level cache

2019-10-30 Thread lawrence antony
Dear Sir

Does Solr support a segment-level cache, so that if only a single segment
changes, only a small portion of the cached data needs to be refreshed?

-- 
with thanks and regards,
lawrence antony.