[jira] [Comment Edited] (SOLR-10987) Solr Cloud overseer node becomes unreachable. Issue Started Recently

2017-07-03 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16072988#comment-16072988
 ] 

Yago Riveiro edited comment on SOLR-10987 at 7/4/17 12:14 AM:
--

If the deployment didn't change at all in the last 7 months, the cause should be 
external: network issues, GC pauses due to a lack of resources, etc.

You should post this kind of issue to the mailing list first before opening a 
ticket; you will have more visibility and a faster response.
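A common external cause of ConnectionLoss on /overseer_elect/leader is a JVM GC pause longer than the ZooKeeper session timeout. As a first check, the timeout can be raised in solr.xml; a minimal illustrative fragment (the 30s value is an assumption, not a recommendation from this thread):

```xml
<!-- solr.xml: raise the ZK session timeout so short GC pauses don't
     expire the session (zkClientTimeout is a stock solr.xml setting;
     the value below is illustrative) -->
<solr>
  <solrcloud>
    <int name="zkClientTimeout">${zkClientTimeout:30000}</int>
  </solrcloud>
</solr>
```

Enabling GC logging on the Solr JVM and correlating pause times with the timestamps of these exceptions would confirm or rule out this cause.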






was (Author: yriveiro):
If the deployment didn't change at all in the last 7 months, the cause should be 
external: network issues, GC pauses due to a lack of resources, etc.

You should post these things to the mailing list first before opening a ticket; you 
will have more visibility and a faster response.





> Solr Cloud overseer node becomes unreachable. Issue Started Recently
> 
>
> Key: SOLR-10987
> URL: https://issues.apache.org/jira/browse/SOLR-10987
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: SolrCloud
>Affects Versions: 6.1
> Environment: *The following is the usage on each of the Solr Nodes:*
> Tasks: 254 total,   1 running, 252 sleeping,   0 stopped,   1 zombie
> %Cpu(s):  0.4 us,  0.3 sy,  0.0 ni, 99.2 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 
> st
> KiB Mem : 20392276 total,  4169296 free,  2917012 used, 13305968 buff/cache
> KiB Swap:  5111804 total,  5111636 free,  168 used. 16058184 avail Mem
>   PID USER  PR  NIVIRTRESSHR S  %CPU %MEM TIME+ COMMAND
> 21250 solr  20   0 23.599g 1.184g 228440 S   2.0  6.1  59:55.91 java
> *Solr is running on 5 machines with similar configuration:*
> Architecture:  x86_64
> CPU op-mode(s):32-bit, 64-bit
> Byte Order:Little Endian
> CPU(s):4
> On-line CPU(s) list:   0-3
> Thread(s) per core:1
> Core(s) per socket:2
> Socket(s): 2
> NUMA node(s):  1
> Vendor ID: GenuineIntel
> CPU family:6
> Model: 62
> Model name:Intel(R) Xeon(R) CPU E5-2680 v2 @ 2.80GHz
> Stepping:  4
> CPU MHz:   2799.033
> BogoMIPS:  5600.00
> Hypervisor vendor: VMware
> Virtualization type:   full
> L1d cache: 32K
> L1i cache: 32K
> L2 cache:  256K
> L3 cache:  25600K
> NUMA node0 CPU(s): 0-3
>Reporter: RAHAT BHALLA
>  Labels: assistance, critical, customer, impacting, issue, need, 
> production
>
> We host a Solr Cloud of 5 nodes for Solr instances and 3 ZooKeeper nodes to 
> maintain the cloud. We have over 70 million docs spread across 13 collections, 
> with 40K more documents being added every day in near real time, in spans 
> of 5 to 6 minutes.
> The system was working as expected and as required for the last 7 months 
> until suddenly we saw the following exception and all of our instances went 
> offline. We restarted the instances and the cloud ran smoothly for three days 
> before it came crashing down again.
> *Exception It gives before it goes down is as follows:*
> 3542285 ERROR 
> (OverseerCollectionConfigSetProcessor-98221003671470081-prod-solr-node01:9080_solr-n_000106)
>  [   ] o.a.s.c.OverseerTaskProcessor
> org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode 
> = ConnectionLoss for /overseer_elect/leader
> at 
> org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
> at 
> org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
> at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1155)
> at 
> org.apache.solr.common.cloud.SolrZkClient$7.execute(SolrZkClient.java:348)
> at 
> org.apache.solr.common.cloud.SolrZkClient$7.execute(SolrZkClient.java:345)
> at 
> org.apache.solr.common.cloud.ZkCmdExecutor.retryOperation(ZkCmdExecutor.java:60)
> at 
> org.apache.solr.common.cloud.SolrZkClient.getData(SolrZkClient.java:345)
> at 
> org.apache.solr.cloud.OverseerTaskProcessor.amILeader(OverseerTaskProcessor.java:384)
> at 
> org.apache.solr.cloud.OverseerTaskProcessor.run(OverseerTaskProcessor.java:191)
> at java.lang.Thread.run(Unknown Source)



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org






[jira] [Commented] (SOLR-9824) Documents indexed in bulk are replicated using too many HTTP requests

2017-06-12 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16046593#comment-16046593
 ] 

Yago Riveiro commented on SOLR-9824:


Will this backported to 6.x branch?

> Documents indexed in bulk are replicated using too many HTTP requests
> -
>
> Key: SOLR-9824
> URL: https://issues.apache.org/jira/browse/SOLR-9824
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: SolrCloud
>Affects Versions: 6.3
>Reporter: David Smiley
>Assignee: Mark Miller
> Attachments: SOLR-9824.patch, SOLR-9824.patch, SOLR-9824.patch, 
> SOLR-9824.patch, SOLR-9824.patch, SOLR-9824.patch, SOLR-9824.patch, 
> SOLR-9824-tflobbe.patch
>
>
> This takes a while to explain; bear with me. While working on bulk indexing 
> small documents, I looked at the logs of my SolrCloud nodes.  I noticed that 
> shards would see an /update log message every ~6ms, which is *way* too much.  
> These are requests from one shard (that isn't a leader/replica for these docs 
> but the recipient from my client) to the target shard leader (no additional 
> replicas).  One might ask why I'm not sending docs to the right shard in the 
> first place; I have a reason but it's beside the point -- there's a real 
> Solr perf problem here and this probably applies equally to 
> replicationFactor>1 situations too.  I could turn off the logs but that would 
> hide useful stuff, and it's disconcerting to me that so many short-lived HTTP 
> requests are happening, somehow at the behest of DistributedUpdateProcessor. 
>  After lots of analysis and debugging and hair pulling, I finally figured it 
> out.  
> In SOLR-7333, [~tpot] introduced an optimization called 
> {{UpdateRequest.isLastDocInBatch()}} in which ConcurrentUpdateSolrClient will 
> poll with a '0' timeout to the internal queue, so that it can close the 
> connection without it hanging around any longer than needed.  This part makes 
> sense to me.  Currently the only spot that has the smarts to set this flag is 
> {{JavaBinUpdateRequestCodec.unmarshal.readOuterMostDocIterator()}} at the 
> last document.  So if a shard received docs in a javabin stream (but not 
> other formats) one would expect the _last_ document to have this flag.  
> There's even a test.  Docs without this flag get the default poll time; for 
> javabin it's 25ms.  Okay.
> I _suspect_ that if someone used CloudSolrClient or HttpSolrClient to send 
> javabin data in a batch, the intended efficiencies of SOLR-7333 would apply.  
> I didn't try. In my case, I'm using ConcurrentUpdateSolrClient (and BTW 
> DistributedUpdateProcessor uses CUSC too).  CUSC uses the RequestWriter 
> (defaulting to javabin) to send each document separately without any leading 
> marker or trailing marker.  For the XML format, by comparison, there is a 
> leading and trailing marker ({{<add>}} ... {{</add>}}).  Since there's no outer 
> container for the javabin unmarshalling to detect the last document, it marks 
> _every_ document as {{req.lastDocInBatch()}}!  Ouch!
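The effect described above can be sketched with a toy model. This is illustrative Python, not Solr's actual client code: a send loop that treats a "last doc in batch" flag as a signal to close the connection, run once with the flag set only on the final document and once with it set on every document (the javabin bug).

```python
# Toy model of the SOLR-9824 effect (illustrative, NOT Solr's actual code):
# if every queued update is flagged lastDocInBatch, the sending client
# closes its HTTP connection after each document instead of once per batch.
from queue import Queue, Empty

def drain(queue):
    """Simplified stand-in for ConcurrentUpdateSolrClient's send loop."""
    connections = 0
    while not queue.empty():
        connections += 1              # open one connection for this run
        while True:
            try:
                doc = queue.get_nowait()
            except Empty:
                break                 # nothing buffered -> close connection
            if doc["lastDocInBatch"]:
                break                 # poll(0) next -> looks empty -> close
    return connections

def send(docs, mark_every_doc_last):
    q = Queue()
    for i, d in enumerate(docs):
        q.put({"doc": d,
               "lastDocInBatch": mark_every_doc_last or i == len(docs) - 1})
    return drain(q)

docs = list(range(100))
print(send(docs, mark_every_doc_last=False))  # -> 1 (one connection per batch)
print(send(docs, mark_every_doc_last=True))   # -> 100 (one per document!)
```

With the flag on every document, the connection count scales with the document count, which matches the ~6ms-per-request log pattern reported above.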






[jira] [Commented] (SOLR-10150) Solr 6.4 up to 10x slower than 6.3

2017-02-16 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15870363#comment-15870363
 ] 

Yago Riveiro commented on SOLR-10150:
-

I think this is a duplicate of SOLR-10130

> Solr 6.4 up to 10x slower than 6.3
> --
>
> Key: SOLR-10150
> URL: https://issues.apache.org/jira/browse/SOLR-10150
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Fabrizio Fortino
>Priority: Critical
>  Labels: performance
> Attachments: Screen Shot 2017-02-16 at 17.31.02.png
>
>
> We noticed a considerable performance degradation (5x to 10x) using Solr 6.4 
> and a huge increase in CPU utilization. Our use case is pretty simple: we have 
> a single Solr core with around 600K small documents. We just do lookups 
> by key (no full-text searches) and use faceting capabilities.
> Using the Solr Admin Thread Dump utilities we noticed a lot of threads using 
> considerable cpuTime / userTime on Codahale metrics (snapshot attached). The 
> metrics part was drastically changed in 6.4 
> (https://issues.apache.org/jira/browse/SOLR-4735). Rolling back to Solr 6.3 
> solved our performance problems.
> Is there any way to disable these metrics in version 6.4?
> Thanks!






[jira] [Commented] (SOLR-7191) Improve stability and startup performance of SolrCloud with thousands of collections

2017-01-21 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15832908#comment-15832908
 ] 

Yago Riveiro commented on SOLR-7191:


Restarting a node in 6.3 now takes forever ... I bumped coreLoadThreads from 4 
to 512, and restarting a node with 1500 collections takes 20-25 minutes. If I 
bump coreLoadThreads to 1024 or 2048 it is faster, but sometimes replicas stay in 
a wrong state and never come up.

Another thing I see happening now is collections being created without replicas.


Shawn, where can I raise Jetty's maxThreads?
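On the Jetty maxThreads question: the server thread pool is sized in the jetty.xml Solr ships under server/etc/. A generic Jetty 9 style fragment (the exact element layout in Solr's shipped file may differ; the value is illustrative):

```xml
<!-- server/etc/jetty.xml: raise the request thread pool ceiling -->
<Configure id="Server" class="org.eclipse.jetty.server.Server">
  <Get name="ThreadPool">
    <Set name="maxThreads" type="int">10000</Set>
  </Get>
</Configure>
```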

> Improve stability and startup performance of SolrCloud with thousands of 
> collections
> 
>
> Key: SOLR-7191
> URL: https://issues.apache.org/jira/browse/SOLR-7191
> Project: Solr
>  Issue Type: Bug
>  Components: SolrCloud
>Affects Versions: 5.0
>Reporter: Shawn Heisey
>Assignee: Noble Paul
>  Labels: performance, scalability
> Fix For: 6.3
>
> Attachments: lots-of-zkstatereader-updates-branch_5x.log, 
> SOLR-7191.patch, SOLR-7191.patch, SOLR-7191.patch, SOLR-7191.patch, 
> SOLR-7191.patch, SOLR-7191.patch, SOLR-7191.patch
>
>
> A user on the mailing list with thousands of collections (5000 on 4.10.3, 
> 4000 on 5.0) is having severe problems with getting Solr to restart.
> I tried as hard as I could to duplicate the user setup, but I ran into many 
> problems myself even before I was able to get 4000 collections created on a 
> 5.0 example cloud setup.  Restarting Solr takes a very long time, and it is 
> not very stable once it's up and running.
> This kind of setup is very much pushing the envelope on SolrCloud performance 
> and scalability.  It doesn't help that I'm running both Solr nodes on one 
> machine (I started with 'bin/solr -e cloud') and that ZK is embedded.






[jira] [Commented] (SOLR-10015) Remove strong reference to Field Cache key (optional) so that GC can release some Field Cache entries when Solr is under memory pressure

2017-01-20 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15832512#comment-15832512
 ] 

Yago Riveiro commented on SOLR-10015:
-

Honestly, I would prefer a degradation of performance to an OOM (where you 
will lose your cache anyway ...)

> Remove strong reference to Field Cache key (optional) so that GC can release 
> some Field Cache entries when Solr is under memory pressure
> 
>
> Key: SOLR-10015
> URL: https://issues.apache.org/jira/browse/SOLR-10015
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Michael Sun
> Attachments: SOLR-10015-prototype.patch
>
>
> In current Field Cache (FC) implementation, a WeakHashMap is used, supposedly 
> designed to allow GC to release some Field Cache entries when Solr is under 
> memory pressure. 
> However, in practice, FC entry releasing seldom happens. Even worse, sometimes 
> Solr goes OOM, and the heap dump shows a large amount of memory is actually used by 
> FC. It's a sign that GC is not able to release FC entries even though WeakHashMap is 
> used.
> The reason is that FC is using SegmentCoreReaders as the key to the 
> WeakHashMap. However, SegmentCoreReaders is usually strongly referenced by 
> SegmentReader. A strong reference prevents GC from releasing the key and 
> therefore the value, so GC can't release entries in FC's WeakHashMap. 
> This JIRA proposes a solution to remove the strong reference mentioned 
> above so that GC can release FC entries, avoiding long GC pauses or OOM. It 
> needs to be optional because this change is a tradeoff, trading more CPU 
> cycles for a lower memory footprint. Users can make the final decision depending on 
> their use cases.
> The prototype attached use a combination of directory name and segment name 
> as key to the WeakHashMap, replacing the SegmentCoreReaders. Without change, 
> Solr doesn't release any FC entries after a GC is manually triggered. With 
> the change, FC entries are usually released after GC.
> However, I am not sure if it's the best way to solve this problem. Any 
> suggestions are welcome.
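The weak-key mechanism described above is easy to demonstrate in miniature. The following is a toy sketch in plain Python (not Solr/Lucene code): a weak-keyed cache only drops an entry once nothing else strongly references its key, which is exactly why a SegmentReader-held key pins its FieldCache entry.

```python
# Toy demonstration (plain Python, not Solr/Lucene code) of the SOLR-10015
# mechanism: a weak-keyed cache only drops an entry once nothing else holds
# a strong reference to its key.
import gc
import weakref

class CoreReader:                     # stand-in for SegmentCoreReaders
    pass

field_cache = weakref.WeakKeyDictionary()   # analogue of the FieldCache map

key = CoreReader()
holder = [key]                        # analogue of SegmentReader's strong ref
field_cache[key] = "uninverted-field-data"

del key                               # the cache's weak ref alone can't keep it
gc.collect()
entries_while_held = len(field_cache)       # 1: 'holder' still pins the key

holder.clear()                        # drop the last strong reference
gc.collect()
entries_after_release = len(field_cache)    # 0: entry is released

print(entries_while_held, entries_after_release)  # -> 1 0
```

The prototype's directory-name+segment-name key works for the same reason: nothing long-lived strongly references that composite key, so GC is free to clear it.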






[jira] [Commented] (SOLR-8589) Add aliases to the LIST action results in the Collections API

2017-01-11 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15818106#comment-15818106
 ] 

Yago Riveiro commented on SOLR-8589:


Any progress on this issue?

> Add aliases to the LIST action results in the Collections API
> -
>
> Key: SOLR-8589
> URL: https://issues.apache.org/jira/browse/SOLR-8589
> Project: Solr
>  Issue Type: Improvement
>  Components: SolrCloud
>Affects Versions: 5.4.1
>Reporter: Shawn Heisey
>Assignee: Shawn Heisey
>Priority: Minor
> Attachments: SOLR-8589.patch, SOLR-8589.patch, SOLR-8589.patch, 
> SOLR-8589.patch, solr-8589-new-list-details-aliases.png
>
>
> Although it is possible to get a list of SolrCloud aliases via an HTTP API, it 
> is not available as a typical query response; I believe it is only available 
> via the HTTP API for ZooKeeper.
> The results from the LIST action in the Collections API are well-situated to 
> handle this. The current results are contained in a "collections" node; we 
> can simply add an "aliases" node if there are any aliases defined.
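A hypothetical shape for the proposed LIST response with the extra node (collection and alias names are made up, and this is a sketch of the proposal, not the final implementation):

```json
{
  "responseHeader": {"status": 0, "QTime": 1},
  "collections": ["books", "logs_2017"],
  "aliases": {"logs": "logs_2017"}
}
```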






[jira] [Commented] (SOLR-9835) Create another replication mode for SolrCloud

2017-01-10 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15816535#comment-15816535
 ] 

Yago Riveiro commented on SOLR-9835:


bq. how about getLiveReplicasCount() ?

If I'm reading the code and find a method called getLiveReplicasCount(), I 
expect it to return the number of live replicas for a shard; if the only 
values it can return are 1 for onlyLeaderIndexes and -1 for the rest, it's not a 
good name.

Something like 
{{zkStateReader.getClusterState().getCollection(collection).getReplicationMode()}}
 that returns an enum (ONLY_LEADER_INDEXES, ALL_REPLICAS_INDEXES), or something 
like that.



> Create another replication mode for SolrCloud
> -
>
> Key: SOLR-9835
> URL: https://issues.apache.org/jira/browse/SOLR-9835
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Cao Manh Dat
>Assignee: Shalin Shekhar Mangar
> Attachments: SOLR-9835.patch, SOLR-9835.patch, SOLR-9835.patch, 
> SOLR-9835.patch, SOLR-9835.patch, SOLR-9835.patch
>
>
> The current replication mechanism of SolrCloud is called state machine: 
> replicas start in the same initial state, and each input is 
> distributed across replicas so all replicas end up in the same next state. 
> But this type of replication has some drawbacks:
> - The commit (which is costly) has to run on all replicas
> - Slow recovery, because if a replica misses more than N updates during its down 
> time, it has to download the entire index from its leader.
> So we create another replication mode for SolrCloud called state 
> transfer, which acts like master/slave replication. Basically:
> - The leader distributes the update to other replicas, but only the leader applies 
> the update to the IndexWriter; other replicas just store the update in the UpdateLog (like 
> replication).
> - Replicas frequently poll the latest segments from the leader.
> Pros:
> - Lightweight indexing, because only the leader runs the commits and 
> updates.
> - Very fast recovery: replicas just have to download the missing segments.
> To use this new replication mode, a new collection must be created with an 
> additional parameter {{liveReplicas=1}}
> {code}
> http://localhost:8983/solr/admin/collections?action=CREATE&name=newCollection&numShards=2&replicationFactor=1&liveReplicas=1
> {code}






[jira] [Commented] (SOLR-9951) FileAlreadyExistsException on replication.properties

2017-01-10 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15814797#comment-15814797
 ] 

Yago Riveiro commented on SOLR-9951:


Isn't this a duplicate of 
[SOLR-9859|https://issues.apache.org/jira/browse/SOLR-9859]?

> FileAlreadyExistsException on replication.properties
> 
>
> Key: SOLR-9951
> URL: https://issues.apache.org/jira/browse/SOLR-9951
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 6.3
>Reporter: Markus Jelsma
>Priority: Minor
> Fix For: master (7.0), 6.4
>
>
> Just spotted this one right after restarting two nodes. Only one node logged 
> the error. It's a single shard with two replica's. The exception was logged 
> for all three active cores:
> {code}
> java.nio.file.FileAlreadyExistsException: 
> /var/lib/solr/core_shard1_replica1/data/replication.properties
>   at 
> sun.nio.fs.UnixException.translateToIOException(UnixException.java:88)
>   at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
>   at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
>   at 
> sun.nio.fs.UnixFileSystemProvider.newByteChannel(UnixFileSystemProvider.java:214)
>   at 
> java.nio.file.spi.FileSystemProvider.newOutputStream(FileSystemProvider.java:434)
>   at java.nio.file.Files.newOutputStream(Files.java:216)
>   at 
> org.apache.lucene.store.FSDirectory$FSIndexOutput.<init>(FSDirectory.java:413)
>   at 
> org.apache.lucene.store.FSDirectory$FSIndexOutput.<init>(FSDirectory.java:409)
>   at 
> org.apache.lucene.store.FSDirectory.createOutput(FSDirectory.java:253)
>   at 
> org.apache.lucene.store.NRTCachingDirectory.createOutput(NRTCachingDirectory.java:157)
>   at 
> org.apache.solr.handler.IndexFetcher.logReplicationTimeAndConfFiles(IndexFetcher.java:675)
>   at 
> org.apache.solr.handler.IndexFetcher.fetchLatestIndex(IndexFetcher.java:487)
>   at 
> org.apache.solr.handler.IndexFetcher.fetchLatestIndex(IndexFetcher.java:251)
>   at 
> org.apache.solr.handler.ReplicationHandler.doFetch(ReplicationHandler.java:397)
>   at 
> org.apache.solr.cloud.RecoveryStrategy.replicate(RecoveryStrategy.java:156)
>   at 
> org.apache.solr.cloud.RecoveryStrategy.doRecovery(RecoveryStrategy.java:408)
>   at org.apache.solr.cloud.RecoveryStrategy.run(RecoveryStrategy.java:221)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:229)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> {code}
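The top frames suggest replication.properties is opened with exclusive-create semantics (Lucene's FSDirectory.createOutput uses Java's CREATE_NEW option, to the best of my reading) while the file already exists on disk. A minimal reproduction of that OS-level behavior in plain Python, not Solr code:

```python
# Minimal reproduction of the failure mode: an exclusive create ('x' mode,
# analogous to Java's CREATE_NEW OpenOption) raises if the target file
# already exists.
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "replication.properties")

with open(path, "x") as f:            # first create succeeds
    f.write("indexReplicatedAt=0\n")  # illustrative content, not real output

try:
    open(path, "x")                   # second exclusive create of same path
    outcome = "created"
except FileExistsError:               # analogue of FileAlreadyExistsException
    outcome = "already exists"

print(outcome)                        # -> already exists
```

So any code path that rewrites the properties file without first deleting it (or without using a truncating create) would log exactly this exception after a restart.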






[jira] [Commented] (SOLR-8362) Add docValues support for TextField

2016-12-30 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15788494#comment-15788494
 ] 

Yago Riveiro commented on SOLR-8362:


Streams only work with fields that have docValues configured. As TextField 
doesn't support docValues, I thought that maybe if the field type had 
docValues the streams would work.

We want the stored value instead; your explanation makes sense :)

> Add docValues support for TextField
> ---
>
> Key: SOLR-8362
> URL: https://issues.apache.org/jira/browse/SOLR-8362
> Project: Solr
>  Issue Type: Improvement
>Reporter: Hoss Man
>
> At the last lucene/solr revolution, Toke asked a question about why TextField 
> doesn't support docValues.  The short answer is because no one ever added it, 
> but the longer answer was because we would have to think through carefully 
> the _intent_ of supporting docValues for  a "tokenized" field like TextField, 
> and how to support various conflicting usecases where they could be handy.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-8362) Add docValues support for TextField

2016-12-30 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15788348#comment-15788348
 ] 

Yago Riveiro commented on SOLR-8362:


Without docValues support for text fields, reindexing a collection using the 
Update Stream Decorator is not possible either.

Streams are great for reindexing data with decent throughput. 







[jira] [Commented] (SOLR-9241) Rebalance API for SolrCloud

2016-12-28 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15782991#comment-15782991
 ] 

Yago Riveiro commented on SOLR-9241:


Issue SOLR-9322, the RESHARD command.

> Rebalance API for SolrCloud
> ---
>
> Key: SOLR-9241
> URL: https://issues.apache.org/jira/browse/SOLR-9241
> Project: Solr
>  Issue Type: New Feature
>  Components: SolrCloud
>Affects Versions: 6.1
> Environment: Ubuntu, Mac OsX
>Reporter: Nitin Sharma
>  Labels: Cluster, SolrCloud
> Fix For: 6.1
>
> Attachments: Redistribute_After.jpeg, Redistribute_Before.jpeg, 
> Redistribute_call.jpeg, Replace_After.jpeg, Replace_Before.jpeg, 
> Replace_Call.jpeg, SOLR-9241-4.6.patch, SOLR-9241-6.1.patch
>
>   Original Estimate: 2,016h
>  Remaining Estimate: 2,016h
>
> This is the v1 of the patch for Solrcloud Rebalance api (as described in 
> http://engineering.bloomreach.com/solrcloud-rebalance-api/) , built at 
> Bloomreach by Nitin Sharma and Suruchi Shah. The goal of the API  is to 
> provide a zero downtime mechanism to perform data manipulation and  efficient 
> core allocation in solrcloud. This API was envisioned to be the base layer 
> that enables Solrcloud to be an auto scaling platform. (and work in unison 
> with other complementing monitoring and scaling features).
> Patch Status:
> ===
> The patch is a work in progress and incremental. We have done a few rounds of 
> code clean-up. We wanted to get the patch going first to get initial 
> feedback.  We will continue to work on making it more open-source friendly and 
> easily testable.
>  Deployment Status:
> 
> The platform is deployed in production at bloomreach and has been battle 
> tested for large scale load. (millions of documents and hundreds of 
> collections).
>  Internals:
> =
> The internals of the API and performance : 
> http://engineering.bloomreach.com/solrcloud-rebalance-api/
> It is built on top of the admin collections API as an action (with various 
> flavors). At a high level, the rebalance api provides 2 constructs:
> Scaling Strategy:  Decides how to move the data.  Every flavor has multiple 
> options which can be reviewed in the api spec.
> Re-distribute  - Move around data in the cluster based on capacity/allocation.
> Auto Shard  - Dynamically shard a collection to any size.
> Smart Merge - Distributed Mode - Helps merging data from a larger shard setup 
> into smaller one.  (the source should be divisible by destination)
> Scale up -  Add replicas on the fly
> Scale Down - Remove replicas on the fly
> Allocation Strategy:  Decides where to put the data.  (Nodes with least 
> cores, Nodes that do not have this collection etc). Custom implementations 
> can be built on top as well. One other example is Availability Zone aware. 
> Distribute data such that every replica is placed on different availability 
> zone to support HA.
>  Detailed API Spec:
> 
>   https://github.com/bloomreach/solrcloud-rebalance-api
>  Contributors:
> =
>   Nitin Sharma
>   Suruchi Shah
>  Questions/Comments:
> =
>   You can reach me at nitin.sha...@bloomreach.com






[jira] [Commented] (SOLR-9241) Rebalance API for SolrCloud

2016-12-27 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15780767#comment-15780767
 ] 

Yago Riveiro commented on SOLR-9241:


Any progress on this?

> Rebalance API for SolrCloud
> ---
>
> Key: SOLR-9241
> URL: https://issues.apache.org/jira/browse/SOLR-9241
> Project: Solr
>  Issue Type: New Feature
>  Components: SolrCloud
>Affects Versions: 6.1
> Environment: Ubuntu, Mac OsX
>Reporter: Nitin Sharma
>  Labels: Cluster, SolrCloud
> Fix For: 6.1
>
> Attachments: Redistribute_After.jpeg, Redistribute_Before.jpeg, 
> Redistribute_call.jpeg, Replace_After.jpeg, Replace_Before.jpeg, 
> Replace_Call.jpeg, SOLR-9241-4.6.patch, SOLR-9241-6.1.patch
>
>   Original Estimate: 2,016h
>  Remaining Estimate: 2,016h
>
> This is the v1 of the patch for Solrcloud Rebalance api (as described in 
> http://engineering.bloomreach.com/solrcloud-rebalance-api/) , built at 
> Bloomreach by Nitin Sharma and Suruchi Shah. The goal of the API  is to 
> provide a zero downtime mechanism to perform data manipulation and  efficient 
> core allocation in solrcloud. This API was envisioned to be the base layer 
> that enables Solrcloud to be an auto scaling platform. (and work in unison 
> with other complementing monitoring and scaling features).
> Patch Status:
> ===
> The patch is work in progress and incremental. We have done a few rounds of 
> code cleanup. We wanted to get the patch going first to get initial feedback. 
> We will continue to work on making it more open source friendly and 
> easily testable.
>  Deployment Status:
> 
> The platform is deployed in production at bloomreach and has been battle 
> tested for large scale load. (millions of documents and hundreds of 
> collections).
>  Internals:
> =
> The internals of the API and performance : 
> http://engineering.bloomreach.com/solrcloud-rebalance-api/
> It is built on top of the admin collections API as an action (with various 
> flavors). At a high level, the rebalance api provides 2 constructs:
> Scaling Strategy: Decides how to move the data. Every flavor has multiple 
> options, which can be reviewed in the API spec.
> Re-distribute  - Move data around the cluster based on capacity/allocation.
> Auto Shard  - Dynamically shard a collection to any size.
> Smart Merge - Distributed Mode - Helps merge data from a larger shard setup 
> into a smaller one (the source shard count should be divisible by the 
> destination's).
> Scale Up - Add replicas on the fly.
> Scale Down - Remove replicas on the fly.
> Allocation Strategy: Decides where to put the data (nodes with the fewest 
> cores, nodes that do not have this collection, etc.). Custom implementations 
> can be built on top as well. Another example is availability-zone awareness: 
> distribute data such that every replica is placed in a different availability 
> zone to support HA.
>  Detailed API Spec:
> 
>   https://github.com/bloomreach/solrcloud-rebalance-api
>  Contributors:
> =
>   Nitin Sharma
>   Suruchi Shah
>  Questions/Comments:
> =
>   You can reach me at nitin.sha...@bloomreach.com
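The two constructs above map naturally onto query parameters of the collections admin API. As a minimal sketch of building such a request — the `REBALANCE` action name and the `*_strategy` parameter names below are illustrative assumptions modelled on the linked spec, not confirmed Solr API names:

```python
from urllib.parse import urlencode

def build_rebalance_url(base_url, collection, scaling_strategy, allocation_strategy):
    """Build a rebalance request URL against the collections admin API.

    "REBALANCE" and the strategy parameter names are hypothetical placeholders
    for the action described in the spec repository, not confirmed API names.
    """
    params = {
        "action": "REBALANCE",                       # hypothetical action name
        "collection": collection,
        "scaling_strategy": scaling_strategy,        # e.g. REDISTRIBUTE, AUTOSHARD
        "allocation_strategy": allocation_strategy,  # e.g. UNUSEDNODES
    }
    return "%s/admin/collections?%s" % (base_url, urlencode(params))

url = build_rebalance_url("http://localhost:8983/solr", "mycollection",
                          "REDISTRIBUTE", "UNUSEDNODES")
```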






[jira] [Commented] (SOLR-9880) Add Ganglia and Graphite metrics reporters

2016-12-22 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15769533#comment-15769533
 ] 

Yago Riveiro commented on SOLR-9880:


With a file-based reporter I can write a wrapper to read the file. That works, 
and for integrations I think it's the better way: plain text is Unix friendly :)

+1 to having a file-based reporter.
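Such a wrapper can stay tiny. A sketch, assuming the file uses Graphite-style plain-text lines (`<name> <value> <unix-timestamp>`) — the actual on-disk format of a file-based reporter is an open question here, so Graphite's plaintext protocol is only a plausible stand-in:

```python
def parse_metric_lines(lines):
    """Parse Graphite-style plain-text metric lines: "<name> <value> <unix-ts>".

    The format is an assumption: Graphite's plaintext protocol stands in for
    whatever a file-based reporter might actually emit.
    """
    metrics = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        name, value, ts = line.split()
        metrics[name] = (float(value), int(ts))
    return metrics

# Hypothetical metric names, for illustration only.
sample = [
    "solr.jvm.memory.heap.used 734003200 1482400000",
    "solr.core.collection1.requests 1042 1482400000",
]
metrics = parse_metric_lines(sample)
```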

> Add Ganglia and Graphite metrics reporters
> --
>
> Key: SOLR-9880
> URL: https://issues.apache.org/jira/browse/SOLR-9880
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: metrics
>Reporter: Andrzej Bialecki 
>Assignee: Andrzej Bialecki 
>Priority: Minor
> Fix For: master (7.0), 6.4
>
>
> Originally SOLR-4735 provided implementations for these reporters (wrappers 
> for Dropwizard components to use {{SolrMetricReporter}} API).
> However, this functionality has been split into its own issue due to the 
> additional transitive dependencies that these reporters bring:
> * Ganglia:
> ** metrics-ganglia, ASL, 3kB
> ** gmetric4j (Ganglia RPC implementation), BSD, 29kB
> * Graphite
> ** metrics-graphite, ASL, 10kB
> ** amqp-client (RabbitMQ Java client, marked optional in pom?), ASL/MIT/GPL2, 
> 190kB
> IMHO these are not very large dependencies, and given the useful 
> functionality they provide it's worth adding them.






[jira] [Commented] (SOLR-9880) Add Ganglia and Graphite metrics reporters

2016-12-21 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15767100#comment-15767100
 ] 

Yago Riveiro commented on SOLR-9880:


Any chance of also adding the metrics-zabbix reporter?

https://github.com/hengyunabc/metrics-zabbix

> Add Ganglia and Graphite metrics reporters
> --
>
> Key: SOLR-9880
> URL: https://issues.apache.org/jira/browse/SOLR-9880
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: metrics
>Reporter: Andrzej Bialecki 
>Assignee: Andrzej Bialecki 
>Priority: Minor
> Fix For: master (7.0), 6.4
>
>
> Originally SOLR-4735 provided implementations for these reporters (wrappers 
> for Dropwizard components to use {{SolrMetricReporter}} API).
> However, this functionality has been split into its own issue due to the 
> additional transitive dependencies that these reporters bring:
> * Ganglia:
> ** metrics-ganglia, ASL, 3kB
> ** gmetric4j (Ganglia RPC implementation), BSD, 29kB
> * Graphite
> ** metrics-graphite, ASL, 10kB
> ** amqp-client (RabbitMQ Java client, marked optional in pom?), ASL/MIT/GPL2, 
> 190kB
> IMHO these are not very large dependencies, and given the useful 
> functionality they provide it's worth adding them.






[jira] [Created] (SOLR-9882) ClassCastException: BasicResultContext cannot be cast to SolrDocumentList

2016-12-20 Thread Yago Riveiro (JIRA)
Yago Riveiro created SOLR-9882:
--

 Summary: ClassCastException: BasicResultContext cannot be cast to 
SolrDocumentList
 Key: SOLR-9882
 URL: https://issues.apache.org/jira/browse/SOLR-9882
 Project: Solr
  Issue Type: Bug
  Security Level: Public (Default Security Level. Issues are Public)
Affects Versions: 6.3
Reporter: Yago Riveiro


After talking with [~yo...@apache.org] on the mailing list, I'm opening this Jira ticket.

I'm hitting this bug in Solr 6.3.0.

null:java.lang.ClassCastException:
org.apache.solr.response.BasicResultContext cannot be cast to
org.apache.solr.common.SolrDocumentList
at
org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:315)
at
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:153)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:2213)
at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:654)
at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:460)
at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:303)
at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:254)
at
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1668)
at
org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:581)
at
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
at
org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
at
org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
at
org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1160)
at
org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:511)
at
org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
at
org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1092)
at
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
at
org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)
at
org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119)
at
org.eclipse.jetty.server.handler.StatisticsHandler.handle(StatisticsHandler.java:169)
at
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
at org.eclipse.jetty.server.Server.handle(Server.java:518)
at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:308)
at
org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:244)
at
org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:273)
at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:95)
at
org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)
at
org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceAndRun(ExecuteProduceConsume.java:246)
at
org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:156)
at
org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:654)
at
org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:572)
at java.lang.Thread.run(Thread.java:745)







[jira] [Commented] (SOLR-3274) ZooKeeper related SolrCloud problems

2016-12-20 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15764336#comment-15764336
 ] 

Yago Riveiro commented on SOLR-3274:


I'm hitting this a lot in 6.3.0 and I don't know why: my ZooKeeper session TTL 
is 120s, and the GC log shows no pauses higher than 100ms.

Is there some configuration to see the reason for the failure talking to 
ZooKeeper, like a connection timeout or something else?

org.apache.solr.common.SolrException: Cannot talk to ZooKeeper - Updates are 
disabled.
at 
org.apache.solr.update.processor.DistributedUpdateProcessor.zkCheck(DistributedUpdateProcessor.java:1508)
at 
org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:696)
at 
org.apache.solr.handler.loader.JavabinLoader$1.update(JavabinLoader.java:97)
at 
org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$1.readOuterMostDocIterator(JavaBinUpdateRequestCodec.java:179)
at 
org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$1.readIterator(JavaBinUpdateRequestCodec.java:135)
at 
org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:275)
at 
org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$1.readNamedList(JavaBinUpdateRequestCodec.java:121)
at 
org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:240)
at 
org.apache.solr.common.util.JavaBinCodec.unmarshal(JavaBinCodec.java:158)
at 
org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec.unmarshal(JavaBinUpdateRequestCodec.java:186)
at 
org.apache.solr.handler.loader.JavabinLoader.parseAndLoadDocs(JavabinLoader.java:107)
at 
org.apache.solr.handler.loader.JavabinLoader.load(JavabinLoader.java:54)
at 
org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:97)
at 
org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:68)
at 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:153)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:2213)
at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:654)
at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:460)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:303)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:254)
at 
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1668)
at 
org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:581)
at 
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
at 
org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
at 
org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
at 
org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1160)
at 
org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:511)
at 
org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
at 
org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1092)
at 
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
at 
org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)
at 
org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119)
at 
org.eclipse.jetty.server.handler.StatisticsHandler.handle(StatisticsHandler.java:169)
at 
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
at org.eclipse.jetty.server.Server.handle(Server.java:518)
at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:308)
at 
org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:244)
at 
org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:273)
at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:95)
at 
org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)
at 
org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceAndRun(ExecuteProduceConsume.java:246)
at 
org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:156)
at 
org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:654)
at 
org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:572)
at java.lang.Thread.run(Thread.java:745)
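For what it's worth, the client-side knob that decides how long Solr tolerates losing contact with ZooKeeper is the session timeout, `zkClientTimeout` in solr.xml; a fragment along these lines (values illustrative):

```xml
<solr>
  <solrcloud>
    <!-- ZooKeeper session timeout in ms: the session expires if Solr cannot
         heartbeat within this window (e.g. during a long GC or network stall),
         which typically precedes "Cannot talk to ZooKeeper - Updates are
         disabled." -->
    <int name="zkClientTimeout">${zkClientTimeout:30000}</int>
  </solrcloud>
</solr>
```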

> ZooKeeper related SolrCloud problems
> 
>
> Key: SOLR-3274
>   

[jira] [Comment Edited] (SOLR-9580) Exception while updating statistics

2016-12-12 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15742756#comment-15742756
 ] 

Yago Riveiro edited comment on SOLR-9580 at 12/12/16 6:43 PM:
--

I'm hitting the same bug with version 6.3.0


was (Author: yriveiro):
I'm running into the same bug with version 6.3.0

> Exception while updating statistics
> ---
>
> Key: SOLR-9580
> URL: https://issues.apache.org/jira/browse/SOLR-9580
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 6.2
>Reporter: Chris de Kok
>
> Replication throws a warning after the 2nd time replication occurs, 
> complaining that replication.properties already exists.
> WARN true
> IndexFetcher
> Exception while updating statistics
> java.nio.file.FileAlreadyExistsException: 
> /var/local/solr/cores/data/replication.properties
>   at 
> sun.nio.fs.UnixException.translateToIOException(UnixException.java:88)
>   at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
>   at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
>   at 
> sun.nio.fs.UnixFileSystemProvider.newByteChannel(UnixFileSystemProvider.java:214)
>   at 
> java.nio.file.spi.FileSystemProvider.newOutputStream(FileSystemProvider.java:434)
>   at java.nio.file.Files.newOutputStream(Files.java:216)
>   at 
> org.apache.lucene.store.FSDirectory$FSIndexOutput.(FSDirectory.java:413)
>   at 
> org.apache.lucene.store.FSDirectory$FSIndexOutput.(FSDirectory.java:409)
>   at 
> org.apache.lucene.store.FSDirectory.createOutput(FSDirectory.java:253)
>   at 
> org.apache.lucene.store.NRTCachingDirectory.createOutput(NRTCachingDirectory.java:157)
>   at 
> org.apache.solr.handler.IndexFetcher.logReplicationTimeAndConfFiles(IndexFetcher.java:681)
>   at 
> org.apache.solr.handler.IndexFetcher.fetchLatestIndex(IndexFetcher.java:493)
>   at 
> org.apache.solr.handler.IndexFetcher.fetchLatestIndex(IndexFetcher.java:254)
>   at 
> org.apache.solr.handler.ReplicationHandler.doFetch(ReplicationHandler.java:397)
>   at 
> org.apache.solr.handler.ReplicationHandler.lambda$setupPolling$2(ReplicationHandler.java:1145)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
>   at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
>   at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)






[jira] [Commented] (SOLR-9580) Exception while updating statistics

2016-12-12 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15742756#comment-15742756
 ] 

Yago Riveiro commented on SOLR-9580:


I'm running into the same bug with version 6.3.0

> Exception while updating statistics
> ---
>
> Key: SOLR-9580
> URL: https://issues.apache.org/jira/browse/SOLR-9580
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 6.2
>Reporter: Chris de Kok
>
> Replication throws a warning after the 2nd time replication occurs, 
> complaining that replication.properties already exists.
> WARN true
> IndexFetcher
> Exception while updating statistics
> java.nio.file.FileAlreadyExistsException: 
> /var/local/solr/cores/data/replication.properties
>   at 
> sun.nio.fs.UnixException.translateToIOException(UnixException.java:88)
>   at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
>   at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
>   at 
> sun.nio.fs.UnixFileSystemProvider.newByteChannel(UnixFileSystemProvider.java:214)
>   at 
> java.nio.file.spi.FileSystemProvider.newOutputStream(FileSystemProvider.java:434)
>   at java.nio.file.Files.newOutputStream(Files.java:216)
>   at 
> org.apache.lucene.store.FSDirectory$FSIndexOutput.(FSDirectory.java:413)
>   at 
> org.apache.lucene.store.FSDirectory$FSIndexOutput.(FSDirectory.java:409)
>   at 
> org.apache.lucene.store.FSDirectory.createOutput(FSDirectory.java:253)
>   at 
> org.apache.lucene.store.NRTCachingDirectory.createOutput(NRTCachingDirectory.java:157)
>   at 
> org.apache.solr.handler.IndexFetcher.logReplicationTimeAndConfFiles(IndexFetcher.java:681)
>   at 
> org.apache.solr.handler.IndexFetcher.fetchLatestIndex(IndexFetcher.java:493)
>   at 
> org.apache.solr.handler.IndexFetcher.fetchLatestIndex(IndexFetcher.java:254)
>   at 
> org.apache.solr.handler.ReplicationHandler.doFetch(ReplicationHandler.java:397)
>   at 
> org.apache.solr.handler.ReplicationHandler.lambda$setupPolling$2(ReplicationHandler.java:1145)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
>   at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
>   at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)






[jira] [Commented] (SOLR-5894) Speed up high-cardinality facets with sparse counters

2016-12-05 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15722603#comment-15722603
 ] 

Yago Riveiro commented on SOLR-5894:


Are facets with sparse counters faster than the current JSON facets?

> Speed up high-cardinality facets with sparse counters
> -
>
> Key: SOLR-5894
> URL: https://issues.apache.org/jira/browse/SOLR-5894
> Project: Solr
>  Issue Type: Improvement
>  Components: SearchComponents - other
>Affects Versions: 4.7.1
>Reporter: Toke Eskildsen
>Priority: Minor
>  Labels: faceted-search, faceting, memory, performance
> Attachments: SOLR-5894.patch, SOLR-5894.patch, SOLR-5894.patch, 
> SOLR-5894.patch, SOLR-5894.patch, SOLR-5894.patch, SOLR-5894.patch, 
> SOLR-5894.patch, SOLR-5894.patch, SOLR-5894_test.zip, SOLR-5894_test.zip, 
> SOLR-5894_test.zip, SOLR-5894_test.zip, SOLR-5894_test.zip, 
> author_7M_tags_1852_logged_queries_warmed.png, 
> sparse_200docs_fc_cutoff_20140403-145412.png, 
> sparse_500docs_20140331-151918_multi.png, 
> sparse_500docs_20140331-151918_single.png, 
> sparse_5051docs_20140328-152807.png
>
>
> Multiple performance enhancements to Solr String faceting.
> * Sparse counters, switching the constant time overhead of extracting top-X 
> terms with time overhead linear to result set size
> * Counter re-use for reduced garbage collection and lower per-call overhead
> * Optional counter packing, trading speed for space
> * Improved distribution count logic, greatly improving the performance of 
> distributed faceting
> * In-segment threaded faceting
> * Regexp based white- and black-listing of facet terms
> * Heuristic faceting for large result sets
> Currently implemented for Solr 4.10. Source, detailed description and 
> directly usable WAR at http://tokee.github.io/lucene-solr/
> This project has grown beyond a simple patch and will require a fair amount 
> of co-operation with a committer to get into Solr. Splitting into smaller 
> issues is a possibility.
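The first two bullets can be illustrated with a small sketch (illustrative only, not the patch's actual data structure): track which counter slots a result set touched, so extracting the top terms and resetting the counter cost time linear in the result-set size rather than in the field's cardinality.

```python
class SparseCounter:
    """Sketch of the sparse-counter idea: alongside a dense count array,
    remember which slots this result set touched, so top-X extraction and
    clearing scale with the result set, not the term cardinality.
    Illustrative only; not the patch's actual implementation.
    """
    def __init__(self, cardinality):
        self.counts = [0] * cardinality   # dense per-term counters
        self.touched = []                 # term ordinals hit by this result set

    def increment(self, ordinal):
        if self.counts[ordinal] == 0:
            self.touched.append(ordinal)  # first hit: remember the slot
        self.counts[ordinal] += 1

    def top(self, k):
        # Scan only the touched slots instead of all `cardinality` slots.
        return sorted(((self.counts[o], o) for o in self.touched), reverse=True)[:k]

    def clear(self):
        # Counter re-use: reset only touched slots, keep the allocation.
        for o in self.touched:
            self.counts[o] = 0
        self.touched.clear()

c = SparseCounter(1_000_000)
for o in [5, 7, 5, 9]:
    c.increment(o)
top2 = c.top(2)  # [(2, 5), (1, 9)]
```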






[jira] [Commented] (SOLR-9818) Solr admin UI rapidly retries any request(s) if it loses connection with the server

2016-12-02 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15714706#comment-15714706
 ] 

Yago Riveiro commented on SOLR-9818:


This problem is critical when we use the UI to create replicas. The last time I 
did the operation while the cluster was busy, the result was 23 new replicas for 
my shard ...

> Solr admin UI rapidly retries any request(s) if it loses connection with the 
> server
> ---
>
> Key: SOLR-9818
> URL: https://issues.apache.org/jira/browse/SOLR-9818
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: web gui
>Affects Versions: 6.3
>Reporter: Ere Maijala
>
> It seems that whenever the Solr admin UI loses connection with the server, 
> whether because the server is too slow to answer or because it has gone away 
> completely, it starts hammering the server with the previous request until it 
> gets a success response. That can be especially bad if the last 
> attempted action was something like collection reload with a SolrCloud 
> instance. The admin UI will quickly add hundreds of reload commands to 
> overseer/collection-queue-work, which may essentially cause the replicas to 
> get overloaded when they're trying to handle all the reload commands.
> I believe the UI should never retry the previous command blindly when the 
> connection is lost, but instead just ping the server until it responds again.
> Steps to reproduce:
> 1.) Fire up Solr
> 2.) Open the admin UI in browser
> 3.) Open a web console in the browser to see the requests it sends
> 4.) Stop solr
> 5.) Try an action in the admin UI
> 6.) Observe the web console in browser quickly fill up with repeats of the 
> originally attempted request
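The proposed behaviour — stop re-sending the failed command and instead poll until the server answers — can be sketched as follows; the health-check callable is injected, and any concrete endpoint choice (such as a ping handler) is an assumption of the sketch:

```python
import time

def wait_until_reachable(ping, max_attempts=30, base_delay=0.5, sleep=time.sleep):
    """Poll a cheap health check with exponential backoff instead of blindly
    re-sending the last admin command.

    `ping` is any callable returning True once the server responds; wiring it
    to a real endpoint is left to the caller. Returns the number of attempts.
    """
    delay = base_delay
    for attempt in range(max_attempts):
        if ping():
            return attempt + 1
        sleep(delay)
        delay = min(delay * 2, 10.0)  # cap the backoff at 10s
    raise TimeoutError("server did not come back within %d attempts" % max_attempts)

# Simulate a server that answers on the third ping; no real sleeping in the demo.
responses = iter([False, False, True])
attempts = wait_until_reachable(lambda: next(responses), sleep=lambda _: None)
# attempts == 3
```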






[jira] [Commented] (SOLR-8586) Implement hash over all documents to check for shard synchronization

2016-08-12 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15419184#comment-15419184
 ] 

Yago Riveiro commented on SOLR-8586:


Then I do not understand how this is possible:

https://www.dropbox.com/s/a6e2wrmedop7xjv/Screenshot%202016-08-12%2018.19.22.png?dl=0

Only with 5.5.x and 6.x does the heap grow without bound. Rolling back to 5.4, 
the amount of memory needed to come up is constant ...

With only one node running 5.5.x I have no problems; when I start a second node 
with 5.5.x, they never get past the phase where they are checking replica 
synchronization.

> Implement hash over all documents to check for shard synchronization
> 
>
> Key: SOLR-8586
> URL: https://issues.apache.org/jira/browse/SOLR-8586
> Project: Solr
>  Issue Type: Improvement
>  Components: SolrCloud
>Reporter: Yonik Seeley
>Assignee: Yonik Seeley
> Fix For: 5.5, 6.0
>
> Attachments: SOLR-8586.patch, SOLR-8586.patch, SOLR-8586.patch, 
> SOLR-8586.patch
>
>
> An order-independent hash across all of the versions in the index should 
> suffice.  The hash itself is pretty easy, but we need to figure out 
> when/where to do this check (for example, I think PeerSync is currently used 
> in multiple contexts and this check would perhaps not be appropriate for all 
> PeerSync calls?)
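The order-independent part is indeed the easy bit. A sketch of the idea (not Solr's actual fingerprint code): hash each version individually, then combine with XOR, which commutes, so iteration order doesn't matter.

```python
def index_fingerprint(versions):
    """Order-independent hash over document _version_ values.

    Sketch only, not Solr's implementation: mix each version through a 64-bit
    multiplicative hash, then XOR-combine, so the result does not depend on
    the order the versions are visited in.
    """
    acc = 0
    for v in versions:
        # Multiplicative mixing keeps structurally close versions from
        # cancelling under XOR.
        h = (v * 0x9E3779B97F4A7C15) & 0xFFFFFFFFFFFFFFFF
        acc ^= h
    return acc

a = index_fingerprint([101, 202, 303])
b = index_fingerprint([303, 101, 202])  # same set, different order
```

Note that XOR-combining assumes the versions are unique, which holds per document: duplicated elements would cancel each other out.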






[jira] [Comment Edited] (SOLR-8586) Implement hash over all documents to check for shard synchronization

2016-08-12 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15419078#comment-15419078
 ] 

Yago Riveiro edited comment on SOLR-8586 at 8/12/16 4:23 PM:
-

My index has 12T of data indexed with 4.0; the _version_ field has only 
supported docValues since 4.7.

To upgrade to 5.x I ran lucene-core-5.x over all my data, but with this new 
feature I need to re-index everything because I don't have docValues for the 
_version_ field, and the feature falls back to the un-inverted method, which 
builds an in-memory structure that doesn't fit in the memory of my servers ...

To be honest, this should never have been done in a minor release ... this 
mandatory feature depends on an optional configuration :/

I will either stay on 5.4 forever, or spend several months re-indexing data and 
figuring out how to update production without downtime. Not an easy task.




was (Author: yriveiro):
My index has 12T of data indexed with 4.0, the _version_ field only support 
docValues since 4.7.

To Upgrade to 5.x I ran the lucene-core-5.x over all my data,but with this new 
feature I need to re-index all my data because I don't have docValues for 
__version__ field and this feature use instead the un-inverted method that 
creates a memory struct that doesn't fit the memory of my servers ...

To be honest, this never should be done in a minor release ... this mandatory 
feature is based in a optional configuration :/

I will die in 5.4 or spend several months re-indexing data and figure out how 
to update production without downtime.  Not an easy task.



> Implement hash over all documents to check for shard synchronization
> 
>
> Key: SOLR-8586
> URL: https://issues.apache.org/jira/browse/SOLR-8586
> Project: Solr
>  Issue Type: Improvement
>  Components: SolrCloud
>Reporter: Yonik Seeley
>Assignee: Yonik Seeley
> Fix For: 5.5, 6.0
>
> Attachments: SOLR-8586.patch, SOLR-8586.patch, SOLR-8586.patch, 
> SOLR-8586.patch
>
>
> An order-independent hash across all of the versions in the index should 
> suffice.  The hash itself is pretty easy, but we need to figure out 
> when/where to do this check (for example, I think PeerSync is currently used 
> in multiple contexts and this check would perhaps not be appropriate for all 
> PeerSync calls?)






[jira] [Commented] (SOLR-8586) Implement hash over all documents to check for shard synchronization

2016-08-12 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15419078#comment-15419078
 ] 

Yago Riveiro commented on SOLR-8586:


My index has 12T of data indexed with 4.0; the _version_ field has only 
supported docValues since 4.7.

To upgrade to 5.x I ran lucene-core-5.x over all my data, but with this new 
feature I need to re-index everything because I don't have docValues for the 
_version_ field, and the feature falls back to the un-inverted method, which 
builds an in-memory structure that doesn't fit in the memory of my servers ...

To be honest, this should never have been done in a minor release ... this 
mandatory feature depends on an optional configuration :/

I will either stay on 5.4 forever, or spend several months re-indexing data and 
figuring out how to update production without downtime. Not an easy task.
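The optional configuration in question is docValues on `_version_` in the schema; a schema.xml fragment along these lines enables it, though documents indexed before the change would still need re-indexing for the docValues to exist:

```xml
<!-- schema.xml: _version_ with docValues enabled at index time. Without this,
     reading versions falls back to un-inverting the indexed field in memory. -->
<field name="_version_" type="long" indexed="true" stored="true" docValues="true"/>
```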



> Implement hash over all documents to check for shard synchronization
> 
>
> Key: SOLR-8586
> URL: https://issues.apache.org/jira/browse/SOLR-8586
> Project: Solr
>  Issue Type: Improvement
>  Components: SolrCloud
>Reporter: Yonik Seeley
>Assignee: Yonik Seeley
> Fix For: 5.5, 6.0
>
> Attachments: SOLR-8586.patch, SOLR-8586.patch, SOLR-8586.patch, 
> SOLR-8586.patch
>
>
> An order-independent hash across all of the versions in the index should 
> suffice.  The hash itself is pretty easy, but we need to figure out 
> when/where to do this check (for example, I think PeerSync is currently used 
> in multiple contexts and this check would perhaps not be appropriate for all 
> PeerSync calls?)






[jira] [Commented] (SOLR-9241) Rebalance API for SolrCloud

2016-08-12 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15418672#comment-15418672
 ] 

Yago Riveiro commented on SOLR-9241:


Will this feature be released in the 6.x branch, or will it be a 7.x feature?

> Rebalance API for SolrCloud
> ---
>
> Key: SOLR-9241
> URL: https://issues.apache.org/jira/browse/SOLR-9241
> Project: Solr
>  Issue Type: New Feature
>  Components: SolrCloud
>Affects Versions: 6.1
> Environment: Ubuntu, Mac OsX
>Reporter: Nitin Sharma
>  Labels: Cluster, SolrCloud
> Fix For: 6.1
>
> Attachments: Redistribute_After.jpeg, Redistribute_Before.jpeg, 
> Redistribute_call.jpeg, Replace_After.jpeg, Replace_Before.jpeg, 
> Replace_Call.jpeg, SOLR-9241-4.6.patch, SOLR-9241-6.1.patch
>
>   Original Estimate: 2,016h
>  Remaining Estimate: 2,016h
>
> This is the v1 of the patch for Solrcloud Rebalance api (as described in 
> http://engineering.bloomreach.com/solrcloud-rebalance-api/) , built at 
> Bloomreach by Nitin Sharma and Suruchi Shah. The goal of the API  is to 
> provide a zero downtime mechanism to perform data manipulation and  efficient 
> core allocation in solrcloud. This API was envisioned to be the base layer 
> that enables Solrcloud to be an auto scaling platform. (and work in unison 
> with other complementing monitoring and scaling features).
> Patch Status:
> ===
> The patch is a work in progress and incremental. We have done a few rounds of 
> code cleanup. We wanted to get the patch going first to get initial 
> feedback. We will continue to work on making it more open-source friendly and 
> easily testable.
>  Deployment Status:
> 
> The platform is deployed in production at bloomreach and has been battle 
> tested for large scale load. (millions of documents and hundreds of 
> collections).
>  Internals:
> =
> The internals of the API and performance : 
> http://engineering.bloomreach.com/solrcloud-rebalance-api/
> It is built on top of the admin collections API as an action (with various 
> flavors). At a high level, the rebalance API provides 2 constructs:
> Scaling Strategy: Decides how to move the data. Every flavor has multiple 
> options, which can be reviewed in the API spec.
> Re-distribute - Move data around the cluster based on capacity/allocation.
> Auto Shard - Dynamically shard a collection to any size.
> Smart Merge - Distributed Mode - Helps merge data from a larger shard setup 
> into a smaller one (the source shard count should be divisible by the destination's).
> Scale Up - Add replicas on the fly.
> Scale Down - Remove replicas on the fly.
> Allocation Strategy: Decides where to put the data (e.g. nodes with the fewest 
> cores, nodes that do not have this collection). Custom implementations 
> can be built on top as well. Another example is availability-zone awareness: 
> distribute data such that every replica is placed in a different availability 
> zone to support HA.
>  Detailed API Spec:
> 
>   https://github.com/bloomreach/solrcloud-rebalance-api
>  Contributors:
> =
>   Nitin Sharma
>   Suruchi Shah
>  Questions/Comments:
> =
>   You can reach me at nitin.sha...@bloomreach.com






[jira] [Commented] (SOLR-8586) Implement hash over all documents to check for shard synchronization

2016-08-11 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15417460#comment-15417460
 ] 

Yago Riveiro commented on SOLR-8586:


Is this operation memory-bound?

I'm trying to upgrade my SolrCloud from 5.4 to 5.5.2, and I can only upgrade one 
node; if I start another node with 5.5.2, the first one dies with an OOM.

The second node never gets past the phase where it checks whether the replicas 
are in sync.

The SolrCloud deployment (2 nodes) has no activity at all; it is a cold 
repository for archived data (around 5 billion documents).



> Implement hash over all documents to check for shard synchronization
> 
>
> Key: SOLR-8586
> URL: https://issues.apache.org/jira/browse/SOLR-8586
> Project: Solr
>  Issue Type: Improvement
>  Components: SolrCloud
>Reporter: Yonik Seeley
>Assignee: Yonik Seeley
> Fix For: 5.5, 6.0
>
> Attachments: SOLR-8586.patch, SOLR-8586.patch, SOLR-8586.patch, 
> SOLR-8586.patch
>
>
> An order-independent hash across all of the versions in the index should 
> suffice.  The hash itself is pretty easy, but we need to figure out 
> when/where to do this check (for example, I think PeerSync is currently used 
> in multiple contexts and this check would perhaps not be appropriate for all 
> PeerSync calls?)






[jira] [Commented] (SOLR-4586) Eliminate the maxBooleanClauses limit

2016-08-03 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15406072#comment-15406072
 ] 

Yago Riveiro commented on SOLR-4586:


This parameter should be unlimited by default; if the user wants a limit, it's 
the user's responsibility to set one.

I've hit this limit several times, and it's illogical: if I have the resources 
to run a 10K-clause boolean query, why can't I do it without tweaking some 
obscure parameter?

+1

> Eliminate the maxBooleanClauses limit
> -
>
> Key: SOLR-4586
> URL: https://issues.apache.org/jira/browse/SOLR-4586
> Project: Solr
>  Issue Type: Improvement
>Affects Versions: 4.2
> Environment: 4.3-SNAPSHOT 1456767M - ncindex - 2013-03-15 13:11:50
>Reporter: Shawn Heisey
> Fix For: 5.2, 6.0
>
> Attachments: SOLR-4586.patch, SOLR-4586.patch, SOLR-4586.patch, 
> SOLR-4586.patch, SOLR-4586.patch, SOLR-4586.patch, 
> SOLR-4586_verify_maxClauses.patch
>
>
> In the #solr IRC channel, I mentioned the maxBooleanClauses limitation to 
> someone asking a question about queries.  Mark Miller told me that 
> maxBooleanClauses no longer applies, that the limitation was removed from 
> Lucene sometime in the 3.x series.  The config still shows up in the example 
> even in the just-released 4.2.
> Checking through the source code, I found that the config option is parsed 
> and the value stored in objects, but does not actually seem to be used by 
> anything.  I removed every trace of it that I could find, and all tests still 
> pass.
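For reference, this is the solrconfig.xml entry under discussion (an illustrative fragment; 1024 is the long-standing default shipped in the example configs):

```xml
<!-- solrconfig.xml fragment: the option the issue proposes to remove.
     1024 has been the default value in the shipped examples for years. -->
<query>
  <maxBooleanClauses>1024</maxBooleanClauses>
</query>
```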






[jira] [Comment Edited] (SOLR-6399) Implement unloadCollection in the Collections API

2016-07-26 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15394033#comment-15394033
 ] 

Yago Riveiro edited comment on SOLR-6399 at 7/26/16 4:24 PM:
-

A flag in core.properties saying the core is not loaded on startup, a new state 
in ZooKeeper saying the collection is unloaded (so queries aren't routed to it, 
recoveries aren't triggered, and the collection isn't reported as down), and a 
command to load the collection on demand would be enough.

I don't want a backup plus a restore; I want to tell the cluster not to load 
the data into memory, to save resources, while still being able to load the 
collection on the fly if necessary.

Backing up data requires extra space somewhere: with a 1T collection you need 
another 1T elsewhere for the backup, to say nothing of transferring the data 
over the network ...

Backup and restore is a nice feature, but in huge clusters with a lot of data 
you can't always do it without a huge amount of resources.


was (Author: yriveiro):
With a flag in the core.properties saying that the core is not loaded on 
startup, a new state in zookeeper saying that collection is unloaded to not 
route queries and not trigger recoveries or notify that collection is down, and 
a command to load the collection on demand It's enough.

I don't want to do a backup with a restore, I want notify the cluster to not 
load data to memory to save resources, but if necessary loading the collection 
on the fly.

Backup data involve extra space somewhere, with 1T collection you needs 1T in 
other location to backup, to say nothing of transfer data over the network ...

> Implement unloadCollection in the Collections API
> -
>
> Key: SOLR-6399
> URL: https://issues.apache.org/jira/browse/SOLR-6399
> Project: Solr
>  Issue Type: New Feature
>Reporter: dfdeshom
>Assignee: Shalin Shekhar Mangar
> Fix For: 6.0
>
>
> There is currently no way to unload a collection without deleting its 
> contents. There should be a way in the collections API to unload a collection 
> and reload it later, as needed.
> A use case for this is the following: you store logs by day, with each day 
> having its own collection. You are required to store up to 2 years of data, 
> which adds up to 730 collections.  Most of the time, you'll want to have 3 
> days of data loaded for search. Having just 3 collections loaded into memory, 
> instead of 730 will make managing Solr easier.






[jira] [Commented] (SOLR-6399) Implement unloadCollection in the Collections API

2016-07-26 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15394033#comment-15394033
 ] 

Yago Riveiro commented on SOLR-6399:


A flag in core.properties saying the core is not loaded on startup, a new state 
in ZooKeeper saying the collection is unloaded (so queries aren't routed to it, 
recoveries aren't triggered, and the collection isn't reported as down), and a 
command to load the collection on demand would be enough.

I don't want a backup plus a restore; I want to tell the cluster not to load 
the data into memory, to save resources, while still being able to load the 
collection on the fly if necessary.

Backing up data requires extra space somewhere: with a 1T collection you need 
another 1T elsewhere for the backup, to say nothing of transferring the data 
over the network ...

> Implement unloadCollection in the Collections API
> -
>
> Key: SOLR-6399
> URL: https://issues.apache.org/jira/browse/SOLR-6399
> Project: Solr
>  Issue Type: New Feature
>Reporter: dfdeshom
>Assignee: Shalin Shekhar Mangar
> Fix For: 6.0
>
>
> There is currently no way to unload a collection without deleting its 
> contents. There should be a way in the collections API to unload a collection 
> and reload it later, as needed.
> A use case for this is the following: you store logs by day, with each day 
> having its own collection. You are required to store up to 2 years of data, 
> which adds up to 730 collections.  Most of the time, you'll want to have 3 
> days of data loaded for search. Having just 3 collections loaded into memory, 
> instead of 730 will make managing Solr easier.






[jira] [Commented] (SOLR-9241) Rebalance API for SolrCloud

2016-07-21 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15387792#comment-15387792
 ] 

Yago Riveiro commented on SOLR-9241:


I have one collection with 6 shards, 200G each (1.2T in total). Hypothetically, 
using this API I would transform it into a 12-shard collection; my concern is 
whether the API will get the job done or fail.

> Rebalance API for SolrCloud
> ---
>
> Key: SOLR-9241
> URL: https://issues.apache.org/jira/browse/SOLR-9241
> Project: Solr
>  Issue Type: New Feature
>  Components: SolrCloud
>Affects Versions: 6.1
> Environment: Ubuntu, Mac OsX
>Reporter: Nitin Sharma
>  Labels: Cluster, SolrCloud
> Fix For: 6.1
>
> Attachments: Redistribute_After.jpeg, Redistribute_Before.jpeg, 
> Redistribute_call.jpeg, Replace_After.jpeg, Replace_Before.jpeg, 
> Replace_Call.jpeg, SOLR-9241-4.6.patch, SOLR-9241-6.1.patch
>
>   Original Estimate: 2,016h
>  Remaining Estimate: 2,016h
>
> This is the v1 of the patch for Solrcloud Rebalance api (as described in 
> http://engineering.bloomreach.com/solrcloud-rebalance-api/) , built at 
> Bloomreach by Nitin Sharma and Suruchi Shah. The goal of the API  is to 
> provide a zero downtime mechanism to perform data manipulation and  efficient 
> core allocation in solrcloud. This API was envisioned to be the base layer 
> that enables Solrcloud to be an auto scaling platform. (and work in unison 
> with other complementing monitoring and scaling features).
> Patch Status:
> ===
> The patch is a work in progress and incremental. We have done a few rounds of 
> code cleanup. We wanted to get the patch going first to get initial 
> feedback. We will continue to work on making it more open-source friendly and 
> easily testable.
>  Deployment Status:
> 
> The platform is deployed in production at bloomreach and has been battle 
> tested for large scale load. (millions of documents and hundreds of 
> collections).
>  Internals:
> =
> The internals of the API and performance : 
> http://engineering.bloomreach.com/solrcloud-rebalance-api/
> It is built on top of the admin collections API as an action (with various 
> flavors). At a high level, the rebalance API provides 2 constructs:
> Scaling Strategy: Decides how to move the data. Every flavor has multiple 
> options, which can be reviewed in the API spec.
> Re-distribute - Move data around the cluster based on capacity/allocation.
> Auto Shard - Dynamically shard a collection to any size.
> Smart Merge - Distributed Mode - Helps merge data from a larger shard setup 
> into a smaller one (the source shard count should be divisible by the destination's).
> Scale Up - Add replicas on the fly.
> Scale Down - Remove replicas on the fly.
> Allocation Strategy: Decides where to put the data (e.g. nodes with the fewest 
> cores, nodes that do not have this collection). Custom implementations 
> can be built on top as well. Another example is availability-zone awareness: 
> distribute data such that every replica is placed in a different availability 
> zone to support HA.
>  Detailed API Spec:
> 
>   https://github.com/bloomreach/solrcloud-rebalance-api
>  Contributors:
> =
>   Nitin Sharma
>   Suruchi Shah
>  Questions/Comments:
> =
>   You can reach me at nitin.sha...@bloomreach.com






[jira] [Commented] (SOLR-9241) Rebalance API for SolrCloud

2016-07-21 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15387498#comment-15387498
 ] 

Yago Riveiro commented on SOLR-9241:


Will this work with 300G shards? The current SPLITSHARD command never finishes 
successfully in my case :(

Other important things to keep in mind:
- The operation can take a week if necessary, but it can't crash the cluster ...
- The amount of resources allocated to this task should be configurable. I 
don't know how, but something like a maximum amount of memory and number of 
threads for the task.

P.S.: This feature is like ... awesome :D

> Rebalance API for SolrCloud
> ---
>
> Key: SOLR-9241
> URL: https://issues.apache.org/jira/browse/SOLR-9241
> Project: Solr
>  Issue Type: New Feature
>  Components: SolrCloud
>Affects Versions: 6.1
> Environment: Ubuntu, Mac OsX
>Reporter: Nitin Sharma
>  Labels: Cluster, SolrCloud
> Fix For: 6.1
>
> Attachments: Redistribute_After.jpeg, Redistribute_Before.jpeg, 
> Redistribute_call.jpeg, Replace_After.jpeg, Replace_Before.jpeg, 
> Replace_Call.jpeg, SOLR-9241-4.6.patch, SOLR-9241-6.1.patch
>
>   Original Estimate: 2,016h
>  Remaining Estimate: 2,016h
>
> This is the v1 of the patch for Solrcloud Rebalance api (as described in 
> http://engineering.bloomreach.com/solrcloud-rebalance-api/) , built at 
> Bloomreach by Nitin Sharma and Suruchi Shah. The goal of the API  is to 
> provide a zero downtime mechanism to perform data manipulation and  efficient 
> core allocation in solrcloud. This API was envisioned to be the base layer 
> that enables Solrcloud to be an auto scaling platform. (and work in unison 
> with other complementing monitoring and scaling features).
> Patch Status:
> ===
> The patch is a work in progress and incremental. We have done a few rounds of 
> code cleanup. We wanted to get the patch going first to get initial 
> feedback. We will continue to work on making it more open-source friendly and 
> easily testable.
>  Deployment Status:
> 
> The platform is deployed in production at bloomreach and has been battle 
> tested for large scale load. (millions of documents and hundreds of 
> collections).
>  Internals:
> =
> The internals of the API and performance : 
> http://engineering.bloomreach.com/solrcloud-rebalance-api/
> It is built on top of the admin collections API as an action (with various 
> flavors). At a high level, the rebalance API provides 2 constructs:
> Scaling Strategy: Decides how to move the data. Every flavor has multiple 
> options, which can be reviewed in the API spec.
> Re-distribute - Move data around the cluster based on capacity/allocation.
> Auto Shard - Dynamically shard a collection to any size.
> Smart Merge - Distributed Mode - Helps merge data from a larger shard setup 
> into a smaller one (the source shard count should be divisible by the destination's).
> Scale Up - Add replicas on the fly.
> Scale Down - Remove replicas on the fly.
> Allocation Strategy: Decides where to put the data (e.g. nodes with the fewest 
> cores, nodes that do not have this collection). Custom implementations 
> can be built on top as well. Another example is availability-zone awareness: 
> distribute data such that every replica is placed in a different availability 
> zone to support HA.
>  Detailed API Spec:
> 
>   https://github.com/bloomreach/solrcloud-rebalance-api
>  Contributors:
> =
>   Nitin Sharma
>   Suruchi Shah
>  Questions/Comments:
> =
>   You can reach me at nitin.sha...@bloomreach.com






[jira] [Commented] (SOLR-8873) Enforce dataDir/instanceDir/ulogDir to be paths that contain only a controlled subset of characters

2016-04-04 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15225253#comment-15225253
 ] 

Yago Riveiro commented on SOLR-8873:


Restricting choices without any evidence of issues is halfway to people (like 
me) starting to question why the enforcement was done.

> Enforce dataDir/instanceDir/ulogDir to be paths that contain only a 
> controlled subset of characters
> ---
>
> Key: SOLR-8873
> URL: https://issues.apache.org/jira/browse/SOLR-8873
> Project: Solr
>  Issue Type: Improvement
>Reporter: Tomás Fernández Löbbe
> Attachments: SOLR-8873.patch
>
>
> We currently support any valid path for dataDir/instanceDir/ulogDir. I think 
> we should prevent special characters and restrict to a subset that is 
> commonly used and tested.
> My initial proposal is to allow the Java pattern: 
> {code:java}"^[a-zA-Z0-9\\.\\ \\-_/\"':]+$"{code} but I'm open to 
> suggestions. I'm not sure if there can be issues with HDFS paths (this 
> pattern does pass the tests we currently have), or some other use case I'm 
> not considering.
> I also think our tests should use all those characters randomly. 
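The proposed pattern can be exercised directly; below is a quick Python transcription of the Java pattern from the description (the expression is identical in both regex dialects):

```python
import re

# The Java pattern proposed in the description, "^[a-zA-Z0-9\.\ \-_/\"':]+$",
# transcribed to Python. It allows letters, digits, dot, space, hyphen,
# underscore, slash, both quote characters, and colon.
ALLOWED = re.compile(r'^[a-zA-Z0-9\. \-_/"\':]+$')

def is_valid_dir(path: str) -> bool:
    """Return True if the path uses only the proposed character subset."""
    return bool(ALLOWED.match(path))

# Typical dataDir/ulogDir values pass, including HDFS-style URIs
# (colons and slashes are in the allowed set):
assert is_valid_dir("/var/solr/data")
assert is_valid_dir("hdfs://namenode:8020/solr/data")

# Characters outside the subset are rejected:
assert not is_valid_dir("/var/solr/data?x=1")
```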






[jira] [Commented] (SOLR-8741) Json Facet API, numBuckets not returning real number of buckets.

2016-03-30 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15217835#comment-15217835
 ] 

Yago Riveiro commented on SOLR-8741:


I hit this bug too.

If I discard the docs that do not have the field that throws the NPE (using 
q=field:* to fetch only docs with values), the hll function doesn't throw the 
NPE.

> Json Facet API, numBuckets not returning real number of buckets.
> 
>
> Key: SOLR-8741
> URL: https://issues.apache.org/jira/browse/SOLR-8741
> Project: Solr
>  Issue Type: Bug
>  Components: Facet Module
>Reporter: Pablo Anzorena
>
> Hi, using the JSON Facet API I realized that numBuckets is wrong. It is 
> not returning the right number of buckets. I have a dimension for which 
> numBuckets reports 1340 buckets, but retrieving all the results brings 
> back 988. 
> FYI the field is of type string.
> Thanks.






[jira] [Comment Edited] (SOLR-8642) SOLR allows creation of collections with invalid names

2016-03-28 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15215025#comment-15215025
 ] 

Yago Riveiro edited comment on SOLR-8642 at 3/28/16 11:09 PM:
--

This can be used to learn something I learned from Linux some time ago: when we 
release an API, we release a legacy, because people will develop codebases 
against it (including its wrong behaviours).

If the API is broken, people like me will be in trouble. This is why you see 
system calls with the same name plus a number at the end, deprecated something 
like 10 years later.

Improvements are good, and I believe this one is being done for a good reason, 
but without tools that let people migrate from the old behaviours they are not 
useful.

Solr should have an LTS version, or at least not introduce backwards-incompatible 
changes in a major release. It's not the first time I've been in this 
situation, and every time I have to explain to my boss that something is broken 
in our current version but we can't upgrade because something else is broken in 
the next version, I feel his assassin instinct :p 

Annoyance level at 9997



was (Author: yriveiro):
This can be use to learn something that I did with linux some time ago. When we 
releases and API, we release legacy, because people will develop a codebase 
using it (this include the wrong behaviours).

If the API is broken, people like me will be in troubles. This is the reason to 
see system calls with the same name and a number in the end and are deprecated 
like 10 years later.

Improvements are good, And I believe that this is doing for a good reason, but 
without tools that allow people to migrate from older behaviours are not useful.

Solr should have an LTS version, or at least don't introduce BC in a major 
release. It's not the first time that I pass for this situation, and every time 
that I need to explain to my boss that something is broken in our current 
version but we can't upgrade because other thing is broken in next version, I 
feel his assassin instinct :p 

Annoying level to 9997


> SOLR allows creation of collections with invalid names
> --
>
> Key: SOLR-8642
> URL: https://issues.apache.org/jira/browse/SOLR-8642
> Project: Solr
>  Issue Type: Bug
>  Components: SolrCloud
>Affects Versions: master
>Reporter: Jason Gerlowski
>Assignee: Erick Erickson
>Priority: Minor
> Fix For: 5.5, master
>
> Attachments: SOLR-8642.patch, SOLR-8642.patch, SOLR-8642.patch, 
> SOLR-8642.patch
>
>
> Some of my colleagues and I recently noticed that the CREATECOLLECTION API 
> will create a collection even when invalid characters are present in the name.
> For example, consider the following reproduction case, which involves 
> creating a collection with a space in its name:
> {code}
> $ 
> $ bin/solr start -e cloud -noprompt
> ...
> $ curl -i -l -k -X GET 
> "http://localhost:8983/solr/admin/collections?action=CREATE&name=getting+started&numShards=2&replicationFactor=2&maxShardsPerNode=2&collection.configName=gettingstarted"
> HTTP/1.1 200 OK
> Content-Type: application/xml; charset=UTF-8
> Transfer-Encoding: chunked
> 
> 
> <int name="status">0</int><int name="QTime">299</int><lst 
> name="failure">org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:Error
>  from server at http://127.0.1.1:8983/solr: Error CREATEing SolrCore 'getting 
> started_shard2_replica2': Unable to create core [getting 
> started_shard2_replica2] Caused by: Invalid core name: 'getting 
> started_shard2_replica2' Names must consist entirely of periods, underscores 
> and 
> alphanumericsorg.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:Error
>  from server at http://127.0.1.1:7574/solr: Error CREATEing SolrCore 'getting 
> started_shard2_replica1': Unable to create core [getting 
> started_shard2_replica1] Caused by: Invalid core name: 'getting 
> started_shard2_replica1' Names must consist entirely of periods, underscores 
> and 
> alphanumericsorg.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:Error
>  from server at http://127.0.1.1:7574/solr: Error CREATEing SolrCore 'getting 
> started_shard1_replica1': Unable to create core [getting 
> started_shard1_replica1] Caused by: Invalid core name: 'getting 
> started_shard1_replica1' Names must consist entirely of periods, underscores 
> and 
> alphanumericsorg.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:Error
>  from server at http://127.0.1.1:8983/solr: Error CREATEing SolrCore 'getting 
> started_shard1_replica2': Unable to create core [getting 
> started_shard1_replica2] Caused by: Invalid core name: 'getting 
> started_shard1_replica2' Names must consist entirely of periods, underscores 
> and alphanumerics
> 
> $ 
> $ curl -i -l -k -X GET 
> "http://localhost:8983/solr/admin/collections?action=CLUSTERSTATUS&wt=json&indent=true"
> HTTP/1.1 200 OK

[jira] [Commented] (SOLR-8642) SOLR allows creation of collections with invalid names

2016-03-28 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15215025#comment-15215025
 ] 

Yago Riveiro commented on SOLR-8642:


This can be used to learn something I learned from Linux some time ago: when we 
release an API, we release a legacy, because people will develop codebases 
against it (including its wrong behaviours).

If the API is broken, people like me will be in trouble. This is why you see 
system calls with the same name plus a number at the end, deprecated something 
like 10 years later.

Improvements are good, and I believe this one is being done for a good reason, 
but without tools that let people migrate from the old behaviours they are not 
useful.

Solr should have an LTS version, or at least not introduce backwards-incompatible 
changes in a major release. It's not the first time I've been in this 
situation, and every time I have to explain to my boss that something is broken 
in our current version but we can't upgrade because something else is broken in 
the next version, I feel his assassin instinct :p 

Annoyance level at 9997


> SOLR allows creation of collections with invalid names
> --
>
> Key: SOLR-8642
> URL: https://issues.apache.org/jira/browse/SOLR-8642
> Project: Solr
>  Issue Type: Bug
>  Components: SolrCloud
>Affects Versions: master
>Reporter: Jason Gerlowski
>Assignee: Erick Erickson
>Priority: Minor
> Fix For: 5.5, master
>
> Attachments: SOLR-8642.patch, SOLR-8642.patch, SOLR-8642.patch, 
> SOLR-8642.patch
>
>
> Some of my colleagues and I recently noticed that the CREATECOLLECTION API 
> will create a collection even when invalid characters are present in the name.
> For example, consider the following reproduction case, which involves 
> creating a collection with a space in its name:
> {code}
> $ 
> $ bin/solr start -e cloud -noprompt
> ...
> $ curl -i -l -k -X GET 
> "http://localhost:8983/solr/admin/collections?action=CREATE&name=getting+started&numShards=2&replicationFactor=2&maxShardsPerNode=2&collection.configName=gettingstarted"
> HTTP/1.1 200 OK
> Content-Type: application/xml; charset=UTF-8
> Transfer-Encoding: chunked
> 
> 
> <int name="status">0</int><int name="QTime">299</int><lst 
> name="failure">org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:Error
>  from server at http://127.0.1.1:8983/solr: Error CREATEing SolrCore 'getting 
> started_shard2_replica2': Unable to create core [getting 
> started_shard2_replica2] Caused by: Invalid core name: 'getting 
> started_shard2_replica2' Names must consist entirely of periods, underscores 
> and 
> alphanumericsorg.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:Error
>  from server at http://127.0.1.1:7574/solr: Error CREATEing SolrCore 'getting 
> started_shard2_replica1': Unable to create core [getting 
> started_shard2_replica1] Caused by: Invalid core name: 'getting 
> started_shard2_replica1' Names must consist entirely of periods, underscores 
> and 
> alphanumericsorg.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:Error
>  from server at http://127.0.1.1:7574/solr: Error CREATEing SolrCore 'getting 
> started_shard1_replica1': Unable to create core [getting 
> started_shard1_replica1] Caused by: Invalid core name: 'getting 
> started_shard1_replica1' Names must consist entirely of periods, underscores 
> and 
> alphanumericsorg.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:Error
>  from server at http://127.0.1.1:8983/solr: Error CREATEing SolrCore 'getting 
> started_shard1_replica2': Unable to create core [getting 
> started_shard1_replica2] Caused by: Invalid core name: 'getting 
> started_shard1_replica2' Names must consist entirely of periods, underscores 
> and alphanumerics
> 
> $ 
> $ curl -i -l -k -X GET 
> "http://localhost:8983/solr/admin/collections?action=CLUSTERSTATUS&wt=json&indent=true"
> HTTP/1.1 200 OK
> Content-Type: application/json; charset=UTF-8
> Transfer-Encoding: chunked
> {
>   "responseHeader":{
> "status":0,
> "QTime":6},
>   "cluster":{
> "collections":{
>  ...
>   "getting started":{
> "replicationFactor":"2",
> "shards":{
>   "shard1":{
> "range":"8000-",
> "state":"active",
> "replicas":{}},
>   "shard2":{
> "range":"0-7fff",
> "state":"active",
> "replicas":{}}},
> "router":{"name":"compositeId"},
> "maxShardsPerNode":"2",
> "autoAddReplicas":"false",
> "znodeVersion":1,
> "configName":"gettingstarted"},
> "live_nodes":["127.0.1.1:8983_solr",
>   "127.0.1.1:7574_solr"]}}
> {code}
> The commands/responses above suggest that Solr creates the collection without 
> checking the name.  It then goes on to create the cores for the collection, 
> which fails and returns the error seen 
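The rule in the error message above ("Names must consist entirely of periods, underscores and alphanumerics") can be approximated with a short Python check. This is an illustrative sketch, not Solr's actual validator:

```python
import re

# Approximation of the rule quoted in the core-creation error:
# names must consist entirely of periods, underscores and alphanumerics.
# Illustrative only; Solr's real check may differ in detail.
VALID_NAME = re.compile(r'^[A-Za-z0-9._]+$')

def is_valid_core_name(name: str) -> bool:
    return bool(VALID_NAME.match(name))

assert is_valid_core_name("gettingstarted_shard1_replica1")
assert not is_valid_core_name("getting started_shard1_replica1")  # space rejected
```

Under this rule the collection-level name "getting started" passes no check at collection-creation time, but the derived core names fail, which matches the behaviour reported above.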

[jira] [Commented] (SOLR-8110) Start enforcing field naming recomendations in next X.0 release?

2016-03-28 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15214992#comment-15214992
 ] 

Yago Riveiro commented on SOLR-8110:


My bad. This issue was pointed to on IRC as the actual place of the discussion 
about name enforcement.

This issue is about schema fields, not the one that enforces collection names.

> Start enforcing field naming recomendations in next X.0 release?
> 
>
> Key: SOLR-8110
> URL: https://issues.apache.org/jira/browse/SOLR-8110
> Project: Solr
>  Issue Type: Improvement
>Reporter: Hoss Man
> Attachments: SOLR-8110.patch, SOLR-8110.patch
>
>
> For a very long time now, Solr has made the following "recommendation" 
> regarding field naming conventions...
> bq. field names should consist of alphanumeric or underscore characters only 
> and not start with a digit.  This is not currently strictly enforced, but 
> other field names will not have first class support from all components and 
> back compatibility is not guaranteed.  ...
> I'm opening this issue to track discussion about if/how we should start 
> enforcing this as a rule instead (instead of just a "recommendation") in our 
> next/future X.0 (ie: major) release.
> The goals of doing so being:
> * simplify some existing code/apis that currently use heuristics to deal with 
> lists of fields and produce strange errors when the heuristic fails (example: 
> ReturnFields.add)
> * reduce confusion/pain for new users who might start out unaware of the 
> recommended conventions and then only later encountering a situation where 
> their field names are not supported by some feature and get frustrated 
> because they have to change their schema, reindex, update index/query client 
> expectations, etc...
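The recommendation quoted above translates to a simple pattern; the sketch below is illustrative Python, not the exact check performed by any Solr component:

```python
import re

# The recommendation from the description: field names should consist of
# alphanumeric or underscore characters only, and not start with a digit.
# Illustrative sketch of what enforcement could look like.
RECOMMENDED = re.compile(r'^[A-Za-z_][A-Za-z0-9_]*$')

def follows_recommendation(field_name: str) -> bool:
    return bool(RECOMMENDED.match(field_name))

assert follows_recommendation("title_txt")
assert not follows_recommendation("2nd_field")   # starts with a digit
assert not follows_recommendation("price-usd")   # hyphen not recommended
```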






[jira] [Comment Edited] (SOLR-8110) Start enforcing field naming recomendations in next X.0 release?

2016-03-28 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15214325#comment-15214325
 ] 

Yago Riveiro edited comment on SOLR-8110 at 3/28/16 3:39 PM:
-

This enforcement shouldn't happen without an API to rename collections ... and 
don't forget that there are people with indexes holding terabytes of data who 
can't do a full re-index.


was (Author: yriveiro):
This enforcement shouldn't happen without an API to rename collections ... and 
don't forget that there are people with indexes holding terabytes of data who 
can't do a full re-index.

> Start enforcing field naming recomendations in next X.0 release?
> 
>
> Key: SOLR-8110
> URL: https://issues.apache.org/jira/browse/SOLR-8110
> Project: Solr
>  Issue Type: Improvement
>Reporter: Hoss Man
> Attachments: SOLR-8110.patch, SOLR-8110.patch
>
>
> For a very long time now, Solr has made the following "recommendation" 
> regarding field naming conventions...
> bq. field names should consist of alphanumeric or underscore characters only 
> and not start with a digit.  This is not currently strictly enforced, but 
> other field names will not have first class support from all components and 
> back compatibility is not guaranteed.  ...
> I'm opening this issue to track discussion about if/how we should start 
> enforcing this as a rule instead (instead of just a "recommendation") in our 
> next/future X.0 (ie: major) release.
> The goals of doing so being:
> * simplify some existing code/apis that currently use heuristics to deal with 
> lists of fields and produce strange errors when the heuristic fails (example: 
> ReturnFields.add)
> * reduce confusion/pain for new users who might start out unaware of the 
> recommended conventions and then only later encountering a situation where 
> their field names are not supported by some feature and get frustrated 
> because they have to change their schema, reindex, update index/query client 
> expectations, etc...



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-8110) Start enforcing field naming recomendations in next X.0 release?

2016-03-28 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15214325#comment-15214325
 ] 

Yago Riveiro commented on SOLR-8110:


This enforcement shouldn't happen without an API to rename collections ... and 
don't forget that there are people with indexes holding terabytes of data who 
can't do a full re-index.

> Start enforcing field naming recomendations in next X.0 release?
> 
>
> Key: SOLR-8110
> URL: https://issues.apache.org/jira/browse/SOLR-8110
> Project: Solr
>  Issue Type: Improvement
>Reporter: Hoss Man
> Attachments: SOLR-8110.patch, SOLR-8110.patch
>
>
> For a very long time now, Solr has made the following "recommendation" 
> regarding field naming conventions...
> bq. field names should consist of alphanumeric or underscore characters only 
> and not start with a digit.  This is not currently strictly enforced, but 
> other field names will not have first class support from all components and 
> back compatibility is not guaranteed.  ...
> I'm opening this issue to track discussion about if/how we should start 
> enforcing this as a rule instead (instead of just a "recommendation") in our 
> next/future X.0 (ie: major) release.
> The goals of doing so being:
> * simplify some existing code/apis that currently use heuristics to deal with 
> lists of fields and produce strange errors when the heuristic fails (example: 
> ReturnFields.add)
> * reduce confusion/pain for new users who might start out unaware of the 
> recommended conventions and then only later encountering a situation where 
> their field names are not supported by some feature and get frustrated 
> because they have to change their schema, reindex, update index/query client 
> expectations, etc...



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Comment Edited] (SOLR-8642) SOLR allows creation of collections with invalid names

2016-03-28 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15214269#comment-15214269
 ] 

Yago Riveiro edited comment on SOLR-8642 at 3/28/16 3:11 PM:
-

Hi,

I can't believe that I can't use a hyphen to create my collections ... I have 
thousands of collections with hyphens, an automated system that creates 
collections on the fly, and a codebase that relies on those collection names.

Sorry, but this change can't be done without an API that allows renaming a 
collection.

I can't upgrade to 5.5 because I can't create collections. This kind of change 
can't land in the middle of a major release. This enforcement should be 
optional.

In 4.x someone decided that on-disk DocValues didn't make sense and deprecated 
it in the middle of a major release; that meant optimizing 10T of data to wipe 
the disk format back to the "default", and 3 months to do it without downtime. 
Now I can't create collections because someone "decided" that hyphens are not 
allowed. (I have used Solr since 3.x with no problems with hyphens.)

Sorry, but this level of annoyance is too much.


was (Author: yriveiro):
Hi,

I can't believe that I can't use a hyphen to create my collections ... I have 
thousands of collections with hyphens, an automated system that creates 
collections on the fly, and a codebase that relies on those collection names.

Sorry, but this change can't be done without an API that allows renaming a 
collection.

I can't upgrade to 5.5 because I can't create collections. This kind of change 
can't land in the middle of a major release. This enforcement should be 
optional.

In 4.x someone decided that on-disk DocValues didn't make sense and deprecated 
it in the middle of a major release; that meant optimizing 10T of data to wipe 
the disk format back to the "default", and 3 months to do it without downtime. 
Now I can't create collections because someone "decided" that hyphens are not 
allowed. (I have used Solr since 3.x with no problems with hyphens.)

Sorry, but this level of annoyance is too much.

> SOLR allows creation of collections with invalid names
> --
>
> Key: SOLR-8642
> URL: https://issues.apache.org/jira/browse/SOLR-8642
> Project: Solr
>  Issue Type: Bug
>  Components: SolrCloud
>Affects Versions: master
>Reporter: Jason Gerlowski
>Assignee: Erick Erickson
>Priority: Minor
> Fix For: 5.5, master
>
> Attachments: SOLR-8642.patch, SOLR-8642.patch, SOLR-8642.patch, 
> SOLR-8642.patch
>
>
> Some of my colleagues and I recently noticed that the CREATECOLLECTION API 
> will create a collection even when invalid characters are present in the name.
> For example, consider the following reproduction case, which involves 
> creating a collection with a space in its name:
> {code}
> $ 
> $ bin/solr start -e cloud -noprompt
> ...
> $ curl -i -l -k -X GET 
> "http://localhost:8983/solr/admin/collections?action=CREATE&name=getting+started&numShards=2&replicationFactor=2&maxShardsPerNode=2&collection.configName=gettingstarted"
> HTTP/1.1 200 OK
> Content-Type: application/xml; charset=UTF-8
> Transfer-Encoding: chunked
> 
> 
> 0 name="QTime">299 name="failure">org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:Error
>  from server at http://127.0.1.1:8983/solr: Error CREATEing SolrCore 'getting 
> started_shard2_replica2': Unable to create core [getting 
> started_shard2_replica2] Caused by: Invalid core name: 'getting 
> started_shard2_replica2' Names must consist entirely of periods, underscores 
> and 
> alphanumericsorg.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:Error
>  from server at http://127.0.1.1:7574/solr: Error CREATEing SolrCore 'getting 
> started_shard2_replica1': Unable to create core [getting 
> started_shard2_replica1] Caused by: Invalid core name: 'getting 
> started_shard2_replica1' Names must consist entirely of periods, underscores 
> and 
> alphanumericsorg.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:Error
>  from server at http://127.0.1.1:7574/solr: Error CREATEing SolrCore 'getting 
> started_shard1_replica1': Unable to create core [getting 
> started_shard1_replica1] Caused by: Invalid core name: 'getting 
> started_shard1_replica1' Names must consist entirely of periods, underscores 
> and 
> alphanumericsorg.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:Error
>  from server at http://127.0.1.1:8983/solr: Error CREATEing SolrCore 'getting 
> started_shard1_replica2': Unable to create core [getting 
> started_shard1_replica2] Caused by: Invalid core name: 'getting 
> started_shard1_replica2' Names must consist entirely of periods, underscores 
> and alphanumerics
> 
> $ 
> $ curl -i -l -k -X GET 
> "http://localhost:8983/solr/admin/collections?action=CLUSTERSTATUS&wt=json&indent=true"
> HTTP/1.1 200 OK
> Content-Type: application/json; 

[jira] [Comment Edited] (SOLR-8642) SOLR allows creation of collections with invalid names

2016-03-28 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15214269#comment-15214269
 ] 

Yago Riveiro edited comment on SOLR-8642 at 3/28/16 3:05 PM:
-

Hi,

I can't believe that I can't use a hyphen to create my collections ... I have 
thousands of collections with hyphens, an automated system that creates 
collections on the fly, and a codebase that relies on those collection names.

Sorry, but this change can't be done without an API that allows renaming a 
collection.

I can't upgrade to 5.5 because I can't create collections. This kind of change 
can't land in the middle of a major release. This enforcement should be 
optional.

In 4.x someone decided that on-disk DocValues didn't make sense and deprecated 
it in the middle of a major release; that meant optimizing 10T of data to wipe 
the disk format back to the "default", and 3 months to do it without downtime. 
Now I can't create collections because someone "decided" that hyphens are not 
allowed. (I have used Solr since 3.x with no problems with hyphens.)

Sorry, but this level of annoyance is too much.


was (Author: yriveiro):
Hi,

I can't believe that I can't use a hyphen to create my collections ... I have 
thousands of collections with hyphens, an automated system that creates 
collections on the fly, and a codebase that relies on those collection names.

Sorry, but this change can't be done without an API that allows renaming a 
collection.

I can't upgrade to 5.5 because I can't create collections. This kind of change 
can't land in the middle of a major release. This enforcement should be 
optional.

In 4.x someone decided that on-disk DocValues didn't make sense and deprecated 
it in the middle of a major release; that meant optimizing 10T of data to wipe 
the disk format back to the "default", and 3 months to do it without downtime. 
Now I can't create collections because someone "decided" that hyphens are not 
allowed. (I have used Solr since 3.x with no problems with hyphens.)

Sorry, but this level of annoyance is too much.

> SOLR allows creation of collections with invalid names
> --
>
> Key: SOLR-8642
> URL: https://issues.apache.org/jira/browse/SOLR-8642
> Project: Solr
>  Issue Type: Bug
>  Components: SolrCloud
>Affects Versions: master
>Reporter: Jason Gerlowski
>Assignee: Erick Erickson
>Priority: Minor
> Fix For: 5.5, master
>
> Attachments: SOLR-8642.patch, SOLR-8642.patch, SOLR-8642.patch, 
> SOLR-8642.patch
>
>
> Some of my colleagues and I recently noticed that the CREATECOLLECTION API 
> will create a collection even when invalid characters are present in the name.
> For example, consider the following reproduction case, which involves 
> creating a collection with a space in its name:
> {code}
> $ 
> $ bin/solr start -e cloud -noprompt
> ...
> $ curl -i -l -k -X GET 
> "http://localhost:8983/solr/admin/collections?action=CREATE&name=getting+started&numShards=2&replicationFactor=2&maxShardsPerNode=2&collection.configName=gettingstarted"
> HTTP/1.1 200 OK
> Content-Type: application/xml; charset=UTF-8
> Transfer-Encoding: chunked
> 
> 
> 0 name="QTime">299 name="failure">org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:Error
>  from server at http://127.0.1.1:8983/solr: Error CREATEing SolrCore 'getting 
> started_shard2_replica2': Unable to create core [getting 
> started_shard2_replica2] Caused by: Invalid core name: 'getting 
> started_shard2_replica2' Names must consist entirely of periods, underscores 
> and 
> alphanumericsorg.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:Error
>  from server at http://127.0.1.1:7574/solr: Error CREATEing SolrCore 'getting 
> started_shard2_replica1': Unable to create core [getting 
> started_shard2_replica1] Caused by: Invalid core name: 'getting 
> started_shard2_replica1' Names must consist entirely of periods, underscores 
> and 
> alphanumericsorg.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:Error
>  from server at http://127.0.1.1:7574/solr: Error CREATEing SolrCore 'getting 
> started_shard1_replica1': Unable to create core [getting 
> started_shard1_replica1] Caused by: Invalid core name: 'getting 
> started_shard1_replica1' Names must consist entirely of periods, underscores 
> and 
> alphanumericsorg.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:Error
>  from server at http://127.0.1.1:8983/solr: Error CREATEing SolrCore 'getting 
> started_shard1_replica2': Unable to create core [getting 
> started_shard1_replica2] Caused by: Invalid core name: 'getting 
> started_shard1_replica2' Names must consist entirely of periods, underscores 
> and alphanumerics
> 
> $ 
> $ curl -i -l -k -X GET 
> "http://localhost:8983/solr/admin/collections?action=CLUSTERSTATUS&wt=json&indent=true"
> HTTP/1.1 200 OK
> Content-Type: application/json; charset=UTF-8
> 

[jira] [Commented] (SOLR-8642) SOLR allows creation of collections with invalid names

2016-03-28 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15214269#comment-15214269
 ] 

Yago Riveiro commented on SOLR-8642:


Hi,

I can't believe that I can't use a hyphen to create my collections ... I have 
thousands of collections with hyphens, an automated system that creates 
collections on the fly, and a codebase that relies on those collection names.

Sorry, but this change can't be done without an API that allows renaming a 
collection.

I can't upgrade to 5.5 because I can't create collections. This kind of change 
can't land in the middle of a major release. This enforcement should be 
optional.

In 4.x someone decided that on-disk DocValues didn't make sense and deprecated 
it in the middle of a major release; that meant optimizing 10T of data to wipe 
the disk format back to the "default", and 3 months to do it without downtime. 
Now I can't create collections because someone "decided" that hyphens are not 
allowed. (I have used Solr since 3.x with no problems with hyphens.)

Sorry, but this level of annoyance is too much.

> SOLR allows creation of collections with invalid names
> --
>
> Key: SOLR-8642
> URL: https://issues.apache.org/jira/browse/SOLR-8642
> Project: Solr
>  Issue Type: Bug
>  Components: SolrCloud
>Affects Versions: master
>Reporter: Jason Gerlowski
>Assignee: Erick Erickson
>Priority: Minor
> Fix For: 5.5, master
>
> Attachments: SOLR-8642.patch, SOLR-8642.patch, SOLR-8642.patch, 
> SOLR-8642.patch
>
>
> Some of my colleagues and I recently noticed that the CREATECOLLECTION API 
> will create a collection even when invalid characters are present in the name.
> For example, consider the following reproduction case, which involves 
> creating a collection with a space in its name:
> {code}
> $ 
> $ bin/solr start -e cloud -noprompt
> ...
> $ curl -i -l -k -X GET 
> "http://localhost:8983/solr/admin/collections?action=CREATE&name=getting+started&numShards=2&replicationFactor=2&maxShardsPerNode=2&collection.configName=gettingstarted"
> HTTP/1.1 200 OK
> Content-Type: application/xml; charset=UTF-8
> Transfer-Encoding: chunked
> 
> 
> 0 name="QTime">299 name="failure">org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:Error
>  from server at http://127.0.1.1:8983/solr: Error CREATEing SolrCore 'getting 
> started_shard2_replica2': Unable to create core [getting 
> started_shard2_replica2] Caused by: Invalid core name: 'getting 
> started_shard2_replica2' Names must consist entirely of periods, underscores 
> and 
> alphanumericsorg.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:Error
>  from server at http://127.0.1.1:7574/solr: Error CREATEing SolrCore 'getting 
> started_shard2_replica1': Unable to create core [getting 
> started_shard2_replica1] Caused by: Invalid core name: 'getting 
> started_shard2_replica1' Names must consist entirely of periods, underscores 
> and 
> alphanumericsorg.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:Error
>  from server at http://127.0.1.1:7574/solr: Error CREATEing SolrCore 'getting 
> started_shard1_replica1': Unable to create core [getting 
> started_shard1_replica1] Caused by: Invalid core name: 'getting 
> started_shard1_replica1' Names must consist entirely of periods, underscores 
> and 
> alphanumericsorg.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:Error
>  from server at http://127.0.1.1:8983/solr: Error CREATEing SolrCore 'getting 
> started_shard1_replica2': Unable to create core [getting 
> started_shard1_replica2] Caused by: Invalid core name: 'getting 
> started_shard1_replica2' Names must consist entirely of periods, underscores 
> and alphanumerics
> 
> $ 
> $ curl -i -l -k -X GET 
> "http://localhost:8983/solr/admin/collections?action=CLUSTERSTATUS&wt=json&indent=true"
> HTTP/1.1 200 OK
> Content-Type: application/json; charset=UTF-8
> Transfer-Encoding: chunked
> {
>   "responseHeader":{
> "status":0,
> "QTime":6},
>   "cluster":{
> "collections":{
>  ...
>   "getting started":{
> "replicationFactor":"2",
> "shards":{
>   "shard1":{
> "range":"8000-",
> "state":"active",
> "replicas":{}},
>   "shard2":{
> "range":"0-7fff",
> "state":"active",
> "replicas":{}}},
> "router":{"name":"compositeId"},
> "maxShardsPerNode":"2",
> "autoAddReplicas":"false",
> "znodeVersion":1,
> "configName":"gettingstarted"},
> "live_nodes":["127.0.1.1:8983_solr",
>   "127.0.1.1:7574_solr"]}}
> {code}
> The commands/responses above suggest that Solr creates the collection without 
> checking the name.  It then goes on to create the cores for the collection, 
> which fails and returns the error seen above.
> I verified this 
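The rule in the error message above ("Names must consist entirely of periods, underscores and alphanumerics") can be sketched as a simple check. This is illustrative only, not Solr's actual validation code:

```java
public class CoreNameCheck {
    // Mirrors the error message quoted above: periods, underscores and
    // alphanumerics only. (Illustrative; not Solr's own validator.)
    public static boolean isLegalCoreName(String name) {
        return !name.isEmpty() && name.matches("[A-Za-z0-9._]+");
    }

    public static void main(String[] args) {
        System.out.println(isLegalCoreName("gettingstarted"));  // true
        System.out.println(isLegalCoreName("getting started")); // false: space
        System.out.println(isLegalCoreName("logs-2016-03-28")); // false: hyphens
    }
}
```

Under such a rule both the space in "getting started" and the hyphens complained about in the comments above would be rejected.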

[jira] [Commented] (SOLR-7452) json facet api returning inconsistent counts in cloud set up

2016-03-15 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15195024#comment-15195024
 ] 

Yago Riveiro commented on SOLR-7452:


[~yo...@apache.org] This feature is important for having accurate information.

Any chance of having this before 6.0 is released?

P.S.: The documentation doesn't mention this situation, which is a big 
downside in some scenarios.

> json facet api returning inconsistent counts in cloud set up
> 
>
> Key: SOLR-7452
> URL: https://issues.apache.org/jira/browse/SOLR-7452
> Project: Solr
>  Issue Type: Bug
>  Components: Facet Module
>Affects Versions: 5.1
>Reporter: Vamsi Krishna D
>  Labels: count, facet, sort
> Fix For: 5.2
>
>   Original Estimate: 96h
>  Remaining Estimate: 96h
>
> While using the newly added feature of json term facet api 
> (http://yonik.com/json-facet-api/#TermsFacet) I am encountering inconsistent 
> returns of counts of faceted value ( Note I am running on a cloud mode of 
> solr). For example consider that i have txns_id(unique field or key), 
> consumer_number and amount. Now for a 10 million such records , lets say i 
> query for 
> q=*:*&rows=0&
>  json.facet={
>biskatoo:{
>type : terms,
>field : consumer_number,
>limit : 20,
>   sort : {y:desc},
>   numBuckets : true,
>   facet:{
>y : "sum(amount)"
>}
>}
>  }
> the results are as follows ( some are omitted ):
> "facets":{
> "count":6641277,
> "biskatoo":{
>   "numBuckets":3112708,
>   "buckets":[{
>   "val":"surya",
>   "count":4,
>   "y":2.264506},
>   {
>   "val":"raghu",
>   "COUNT":3,   // capitalised for recognition 
>   "y":1.8},
> {
>   "val":"malli",
>   "count":4,
>   "y":1.78}]}}}
> but if i restrict the query to 
> q=consumer_number:raghu&rows=0&
>  json.facet={
>biskatoo:{
>type : terms,
>field : consumer_number,
>limit : 20,
>   sort : {y:desc},
>   numBuckets : true,
>   facet:{
>y : "sum(amount)"
>}
>}
>  }
> i get :
>   "facets":{
> "count":4,
> "biskatoo":{
>   "numBuckets":1,
>   "buckets":[{
>   "val":"raghu",
>   "COUNT":4,
>   "y":2429708.24}]}}}
> One can see the count results are inconsistent (and I found many occasions 
> of inconsistency).
> I have tried the patch https://issues.apache.org/jira/browse/SOLR-7412 but 
> the issue still seems unresolved
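The inconsistency reported above is the classic distributed top-N problem: each shard returns only its local top terms, so a term that falls below the cut-off on one shard loses that shard's contribution in the merged count. A toy sketch of the effect, using hypothetical per-shard counts (this is not Solr code, just an illustration of the naive merge):

```java
import java.util.*;

public class FacetMergeSketch {
    // Merge per-shard top-N term counts, as a naive distributed facet would:
    // each shard contributes only its local top `limit` terms.
    static Map<String, Integer> mergeTopN(List<Map<String, Integer>> shards,
                                          int limit) {
        Map<String, Integer> merged = new HashMap<>();
        for (Map<String, Integer> shard : shards) {
            shard.entrySet().stream()
                 .sorted(Map.Entry.<String, Integer>comparingByValue().reversed())
                 .limit(limit)
                 .forEach(e -> merged.merge(e.getKey(), e.getValue(), Integer::sum));
        }
        return merged;
    }

    public static void main(String[] args) {
        Map<String, Integer> shard1 = Map.of("raghu", 3, "surya", 2, "malli", 1);
        Map<String, Integer> shard2 = Map.of("malli", 3, "surya", 2, "raghu", 1);
        Map<String, Integer> merged = mergeTopN(List.of(shard1, shard2), 2);
        // True total for "raghu" is 4, but shard2 truncated it out of its
        // local top 2, so the merged facet under-counts it.
        System.out.println(merged.get("raghu")); // 3 (under-counted)
        System.out.println(merged.get("surya")); // 4 (both shards contributed)
    }
}
```

Refinement strategies (re-querying shards for candidate terms) fix exactly this kind of under-count, which is why the comment above asks for the feature.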



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Comment Edited] (SOLR-6399) Implement unloadCollection in the Collections API

2016-03-08 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15184986#comment-15184986
 ] 

Yago Riveiro edited comment on SOLR-6399 at 3/8/16 2:44 PM:


Any chance of this issue seeing the light of day?

In setups with thousands of collections, this feature is very useful for not 
spending resources on collections without activity.


was (Author: yriveiro):
Any chance of this issue seeing the light of day?

In setups with thousands of collections, this feature is very useful for not 
spending resources on collections without activity.

> Implement unloadCollection in the Collections API
> -
>
> Key: SOLR-6399
> URL: https://issues.apache.org/jira/browse/SOLR-6399
> Project: Solr
>  Issue Type: New Feature
>Affects Versions: 4.9
>Reporter: dfdeshom
>Assignee: Shalin Shekhar Mangar
> Fix For: master
>
>
> There is currently no way to unload a collection without deleting its 
> contents. There should be a way in the collections API to unload a collection 
> and reload it later, as needed.
> A use case for this is the following: you store logs by day, with each day 
> having its own collection. You are required to store up to 2 years of data, 
> which adds up to 730 collections.  Most of the time, you'll want to have 3 
> days of data loaded for search. Having just 3 collections loaded into memory, 
> instead of 730 will make managing Solr easier.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-6399) Implement unloadCollection in the Collections API

2016-03-08 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15184986#comment-15184986
 ] 

Yago Riveiro commented on SOLR-6399:


Any chance of this issue seeing the light of day?

In setups with thousands of collections, this feature is very useful for not 
spending resources on collections without activity.

> Implement unloadCollection in the Collections API
> -
>
> Key: SOLR-6399
> URL: https://issues.apache.org/jira/browse/SOLR-6399
> Project: Solr
>  Issue Type: New Feature
>Affects Versions: 4.9
>Reporter: dfdeshom
>Assignee: Shalin Shekhar Mangar
> Fix For: master
>
>
> There is currently no way to unload a collection without deleting its 
> contents. There should be a way in the collections API to unload a collection 
> and reload it later, as needed.
> A use case for this is the following: you store logs by day, with each day 
> having its own collection. You are required to store up to 2 years of data, 
> which adds up to 730 collections.  Most of the time, you'll want to have 3 
> days of data loaded for search. Having just 3 collections loaded into memory, 
> instead of 730 will make managing Solr easier.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-8635) Shards don't propagate the document update correctly

2016-02-03 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15130531#comment-15130531
 ] 

Yago Riveiro commented on SOLR-8635:


In your solrconfig you have:
{quote}
LUCENE_40 
{quote}
but it should be:
{quote}
LUCENE_5.4.1
{quote}
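In solrconfig.xml that setting is the {{luceneMatchVersion}} element. A minimal sketch of the corrected value (the element name is standard Solr configuration; the exact value should match the version actually deployed):

```xml
<!-- solrconfig.xml: controls Lucene's backwards-compatibility behaviour.
     Should track the Solr/Lucene version actually deployed (5.4.1 here),
     not a stale constant such as LUCENE_40. -->
<luceneMatchVersion>5.4.1</luceneMatchVersion>
```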

> Shards don't propagate the document update correctly
> 
>
> Key: SOLR-8635
> URL: https://issues.apache.org/jira/browse/SOLR-8635
> Project: Solr
>  Issue Type: Bug
>  Components: SolrCloud
>Affects Versions: 5.4.1
> Environment: - Red Hat Enterprise Linux Server release 5.6 (Tikanga)
> - Oracle jdk1.7.0_79
> - Apache Solr 5.4.1
> - Apache Zookeeper 3.4.6
>Reporter: Alberto Ferrini
>  Labels: shard, solrcloud, update
> Attachments: schema.xml, solrconfig.xml, zoo.cfg
>
>
> I created a SolrCloud infrastructure with 2 shards, each with 1 leader and 2 
> replicas; Zookeeper is deployed in an external ensemble.
> When I add a new document, or when I delete an existing document, all works 
> correctly.
> But when I update an existing document, the field value is not correctly 
> propagated between the shards, leaving the index inconsistent (the query 
> result for that document sometimes shows the new value, sometimes the old 
> value; I can see the value because the field is stored).
> Example for the reproduction of the issue:
> - Create document with id "List" and field PATH with value 1 on shard *1*.
> - Query for document (ID:List) -> All OK
> - Create document with id "List" and field PATH with value 2 on shard *2* 
> (document update).
> - Query for document (ID:List) -> Issue: sometimes answers with value 1, 
> sometimes answers with value 2



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-8589) Add aliases to the LIST action results in the Collections API

2016-01-26 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15117356#comment-15117356
 ] 

Yago Riveiro commented on SOLR-8589:


How is the aliases list exposed in SOLR-4968?

> Add aliases to the LIST action results in the Collections API
> -
>
> Key: SOLR-8589
> URL: https://issues.apache.org/jira/browse/SOLR-8589
> Project: Solr
>  Issue Type: Improvement
>  Components: SolrCloud
>Affects Versions: 5.4.1
>Reporter: Shawn Heisey
>Assignee: Shawn Heisey
>Priority: Minor
> Attachments: SOLR-8589.patch, solr-8589-new-list-details-aliases.png
>
>
> Although it is possible to get a list of SolrCloud aliases via an HTTP API, it 
> is not available as a typical query response; I believe it is only available 
> via the HTTP API for ZooKeeper.
> The results from the LIST action in the Collections API are well-suited to 
> handle this. The current results are contained in a "collections" node; we 
> can simply add an "aliases" node if there are any aliases defined.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-8589) Add aliases to the LIST action results in the Collections API

2016-01-26 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15117443#comment-15117443
 ] 

Yago Riveiro commented on SOLR-8589:


[~elyograg], as I said before, aliases are related to collections, so a new 
command doesn't make sense. An alias of a collection is a virtual collection 
and therefore should be part of the LIST command. We share the same opinion.

> Add aliases to the LIST action results in the Collections API
> -
>
> Key: SOLR-8589
> URL: https://issues.apache.org/jira/browse/SOLR-8589
> Project: Solr
>  Issue Type: Improvement
>  Components: SolrCloud
>Affects Versions: 5.4.1
>Reporter: Shawn Heisey
>Assignee: Shawn Heisey
>Priority: Minor
> Attachments: SOLR-8589.patch, solr-8589-new-list-details-aliases.png
>
>
> Although it is possible to get a list of SolrCloud aliases via an HTTP API, it 
> is not available as a typical query response; I believe it is only available 
> via the HTTP API for ZooKeeper.
> The results from the LIST action in the Collections API are well-suited to 
> handle this. The current results are contained in a "collections" node; we 
> can simply add an "aliases" node if there are any aliases defined.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-8589) Add aliases to the LIST action results in the Collections API

2016-01-23 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15113817#comment-15113817
 ] 

Yago Riveiro commented on SOLR-8589:


Aliases are related to collections. If I need to do an HTTP call to get the 
aliases from the clusterstatus API, I will need to parse a huge structure 
(with thousands of collections) full of noise just to learn the aliases ...

IMHO the Collections API should return this info when requested in a LIST 
command, with something like aliases=true.

> Add aliases to the LIST action results in the Collections API
> -
>
> Key: SOLR-8589
> URL: https://issues.apache.org/jira/browse/SOLR-8589
> Project: Solr
>  Issue Type: Improvement
>  Components: SolrCloud
>Affects Versions: 5.4.1
>Reporter: Shawn Heisey
>Assignee: Shawn Heisey
>Priority: Minor
> Attachments: SOLR-8589.patch
>
>
> Although it is possible to get a list of SolrCloud aliases via an HTTP API, it 
> is not available as a typical query response; I believe it is only available 
> via the HTTP API for ZooKeeper.
> The results from the LIST action in the Collections API are well-suited to 
> handle this. The current results are contained in a "collections" node; we 
> can simply add an "aliases" node if there are any aliases defined.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (SOLR-8257) DELETEREPLICA command shouldn't delete de last replica of a shard

2015-11-09 Thread Yago Riveiro (JIRA)
Yago Riveiro created SOLR-8257:
--

 Summary: DELETEREPLICA command shouldn't delete de last replica of 
a shard
 Key: SOLR-8257
 URL: https://issues.apache.org/jira/browse/SOLR-8257
 Project: Solr
  Issue Type: Bug
Reporter: Yago Riveiro
Priority: Minor


The DELETEREPLICA command shouldn't remove the last replica of a shard.

The original thread in the mailing list 
http://lucene.472066.n3.nabble.com/DELETEREPLICA-command-shouldn-t-delete-de-last-replica-of-a-shard-td4239054.html



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-6583) Resuming connection with ZooKeeper causes log replay

2014-12-02 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14231304#comment-14231304
 ] 

Yago Riveiro commented on SOLR-6583:


This is happening in Solr 4.6.1 too.

{code}
ERROR - app2 - 2014-12-01 21:30:42.820; org.apache.solr.update.UpdateLog; Error 
inspecting tlog 
tlog{file=/solr/node/collections/collection1_shard2_replica1/data/tlog/tlog.0001284
 refcount=2}
{code}

 Resuming connection with ZooKeeper causes log replay
 

 Key: SOLR-6583
 URL: https://issues.apache.org/jira/browse/SOLR-6583
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Affects Versions: 4.10.1
Reporter: Shalin Shekhar Mangar
Assignee: Shalin Shekhar Mangar
Priority: Minor
 Fix For: 5.0, Trunk


 If a node is partitioned from ZooKeeper for an extended period of time then 
 upon resuming connection, the node re-registers itself causing 
 recoverFromLog() method to be executed which fails with the following 
 exception:
 {code}
 8091124 [Thread-71] ERROR org.apache.solr.update.UpdateLog  – Error 
 inspecting tlog 
 tlog{file=/home/ubuntu/shalin-lusolr/solr/example/solr/collection_5x3_shard5_replica3/data/tlog/tlog.0009869
  refcount=2}
 java.nio.channels.ClosedChannelException
 at sun.nio.ch.FileChannelImpl.ensureOpen(FileChannelImpl.java:99)
 at sun.nio.ch.FileChannelImpl.read(FileChannelImpl.java:678)
 at 
 org.apache.solr.update.ChannelFastInputStream.readWrappedStream(TransactionLog.java:784)
 at 
 org.apache.solr.common.util.FastInputStream.refill(FastInputStream.java:89)
 at 
 org.apache.solr.common.util.FastInputStream.read(FastInputStream.java:125)
 at java.io.InputStream.read(InputStream.java:101)
 at 
 org.apache.solr.update.TransactionLog.endsWithCommit(TransactionLog.java:218)
 at org.apache.solr.update.UpdateLog.recoverFromLog(UpdateLog.java:800)
 at org.apache.solr.cloud.ZkController.register(ZkController.java:834)
 at org.apache.solr.cloud.ZkController$1.command(ZkController.java:271)
 at 
 org.apache.solr.common.cloud.ConnectionManager$1$1.run(ConnectionManager.java:166)
 8091125 [Thread-71] ERROR org.apache.solr.update.UpdateLog  – Error 
 inspecting tlog 
 tlog{file=/home/ubuntu/shalin-lusolr/solr/example/solr/collection_5x3_shard5_replica3/data/tlog/tlog.0009870
  refcount=2}
 java.nio.channels.ClosedChannelException
 at sun.nio.ch.FileChannelImpl.ensureOpen(FileChannelImpl.java:99)
 at sun.nio.ch.FileChannelImpl.read(FileChannelImpl.java:678)
 at 
 org.apache.solr.update.ChannelFastInputStream.readWrappedStream(TransactionLog.java:784)
 at 
 org.apache.solr.common.util.FastInputStream.refill(FastInputStream.java:89)
 at 
 org.apache.solr.common.util.FastInputStream.read(FastInputStream.java:125)
 at java.io.InputStream.read(InputStream.java:101)
 at 
 org.apache.solr.update.TransactionLog.endsWithCommit(TransactionLog.java:218)
 at org.apache.solr.update.UpdateLog.recoverFromLog(UpdateLog.java:800)
 at org.apache.solr.cloud.ZkController.register(ZkController.java:834)
 at org.apache.solr.cloud.ZkController$1.command(ZkController.java:271)
 at 
 org.apache.solr.common.cloud.ConnectionManager$1$1.run(ConnectionManager.java:166)
 {code}
 This is because the recoverFromLog uses transaction log references that were 
 collected at startup and are no longer valid.
 We shouldn't even be running recoverFromLog code for ZK re-connect.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5473) Make one state.json per collection

2014-06-26 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14044678#comment-14044678
 ] 

Yago Riveiro commented on SOLR-5473:


[~elyograg] I may be wrong, but I think that the 1MB limit is per znode, not 
for the whole ZK database.



 Make one state.json per collection
 --

 Key: SOLR-5473
 URL: https://issues.apache.org/jira/browse/SOLR-5473
 Project: Solr
  Issue Type: Sub-task
  Components: SolrCloud
Reporter: Noble Paul
Assignee: Noble Paul
 Fix For: 5.0

 Attachments: SOLR-5473-74 .patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-configname-fix.patch, SOLR-5473.patch, SOLR-5473.patch, 
 SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, 
 SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, 
 SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, 
 SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, 
 SOLR-5473_undo.patch, ec2-23-20-119-52_solr.log, ec2-50-16-38-73_solr.log


 As defined in the parent issue, store the states of each collection under 
 /collections/collectionname/state.json node



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-4793) Solr Cloud can't upload large config files (> 1MB) to Zookeeper

2014-06-20 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14038810#comment-14038810
 ] 

Yago Riveiro commented on SOLR-4793:


I think that version 4.8 updated the ZooKeeper version to 3.4.6.

If the workaround doesn't work, then it is a serious issue if you have a large 
number of collections and replicas, because all metadata about the cluster is in 
the clusterstate.json file.

[~ecario], how did you notice that the workaround doesn't work? Do you have any 
logs or something? And one last question: did you upgrade Solr from 4.7 to 4.8, 
or is it a fresh install?




 Solr Cloud can't upload large config files (> 1MB) to Zookeeper
 -

 Key: SOLR-4793
 URL: https://issues.apache.org/jira/browse/SOLR-4793
 Project: Solr
  Issue Type: Improvement
Reporter: Son Nguyen

 ZooKeeper sets the znode size limit to 1MB by default, so we can't start Solr 
 Cloud with some large config files, like synonyms.txt.
 Jan Høydahl has a good idea:
 SolrCloud is designed with an assumption that you should be able to upload 
 your whole disk-based conf folder into ZK, and that you should be able to add 
 an empty Solr node to a cluster and it would download all config from ZK. So 
 immediately a splitting strategy automatically handled by ZkSolrResourceLoader 
 for large files could be one way forward, i.e. store synonyms.txt as e.g. 
 __001_synonyms.txt, __002_synonyms.txt



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-4793) Solr Cloud can't upload large config files (> 1MB) to Zookeeper

2014-06-20 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14038874#comment-14038874
 ] 

Yago Riveiro commented on SOLR-4793:


Elaine, can you paste the Tomcat and ZooKeeper configuration that you have for 
jute.maxbuffer?




--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-4793) Solr Cloud can't upload large config files (> 1MB) to Zookeeper

2014-06-20 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14038898#comment-14038898
 ] 

Yago Riveiro commented on SOLR-4793:


Regarding Tomcat, I have the same configuration.

In the case of ZooKeeper, I have all custom configurations in a file named 
zookeeper-env.sh located in the bin/conf folder, with this content:

{code}
#!/usr/bin/env bash

ZOO_ENV=-Djute.maxbuffer=5000
{code}
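For what it's worth, the more common pattern on ZooKeeper 3.4.x is to set SERVER_JVMFLAGS, the variable zkServer.sh passes to the server JVM; the buffer value below is only an example:

{code}
#!/usr/bin/env bash
# conf/zookeeper-env.sh -- sourced by bin/zkEnv.sh when present
SERVER_JVMFLAGS="-Djute.maxbuffer=5000000"
{code}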




--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-4793) Solr Cloud can't upload large config files (> 1MB) to Zookeeper

2014-06-20 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14038918#comment-14038918
 ] 

Yago Riveiro commented on SOLR-4793:


Indeed, after diving into the zkEnv file, I realised that if zookeeper-env.sh 
exists, ZooKeeper appends its configurations to the init command.




--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-4793) Solr Cloud can't upload large config files (> 1MB) to Zookeeper

2014-06-20 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14038970#comment-14038970
 ] 

Yago Riveiro commented on SOLR-4793:


Elaine, now it's easier to debug, since you know where the problem is :).

Note: I'm using the 3.4.5 version of ZooKeeper; I don't know if zkServer.sh 
was changed.




--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-4793) Solr Cloud can't upload large config files (> 1MB) to Zookeeper

2014-06-20 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14039198#comment-14039198
 ] 

Yago Riveiro commented on SOLR-4793:


It's probably because I tweaked the zkServer file a bit ... :P




--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-5788) Document update in case of error doesn't return the error message correctly

2014-02-27 Thread Yago Riveiro (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yago Riveiro updated SOLR-5788:
---

Summary: Document update in case of error doesn't return the error message 
correctly  (was: Document update in case of error doesn't returns the error 
message correctly)

 Document update in case of error doesn't return the error message correctly
 ---

 Key: SOLR-5788
 URL: https://issues.apache.org/jira/browse/SOLR-5788
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.6.1
Reporter: Yago Riveiro

 I found an issue when updating a document.
 If for any reason the update can't be done (for example, the schema doesn't 
 match the incoming doc), the error raised to the user is something like:
 {noformat}
 curl 'http://localhost:8983/solr/collection1/update?commit=true' 
 --data-binary @doc.json -H 'Content-type:application/json'
 {"responseHeader":{"status":400,"QTime":52},"error":{"msg":"Bad 
 Request\n\n\n\nrequest: 
 http://localhost:8983/solr/collection1_shard3_replica1/update?update.distrib=TOLEADER&distrib.from=http%3A%2F%2Flocalhost%3A8983%2Fsolr%2Fcollection1_shard1_replica2%2F&wt=javabin&version=2","code":400}}
 {noformat}
 In case the update was done on the leader, the error message is (IMHO) 
 correct and contains valuable info:
 {noformat}
 curl 'http://localhost:8983/solr/collection1/update?commit=true' 
 --data-binary @doc.json -H 'Content-type:application/json'
 {"responseHeader":{"status":400,"QTime":19},"error":{"msg":"ERROR: 
 [doc=01!12967564] Error adding field 'source'='[Direct]' msg=For input 
 string: \"Direct\"","code":400}}
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (SOLR-5788) Document update in case of error doesn't returns the error message correctly

2014-02-27 Thread Yago Riveiro (JIRA)
Yago Riveiro created SOLR-5788:
--

 Summary: Document update in case of error doesn't returns the 
error message correctly
 Key: SOLR-5788
 URL: https://issues.apache.org/jira/browse/SOLR-5788
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.6.1
Reporter: Yago Riveiro


I found an issue when updating a document.

If for any reason the update can't be done (for example, the schema doesn't 
match the incoming doc), the error raised to the user is something like:

{noformat}
curl 'http://localhost:8983/solr/collection1/update?commit=true' --data-binary 
@doc.json -H 'Content-type:application/json'
{"responseHeader":{"status":400,"QTime":52},"error":{"msg":"Bad 
Request\n\n\n\nrequest: 
http://localhost:8983/solr/collection1_shard3_replica1/update?update.distrib=TOLEADER&distrib.from=http%3A%2F%2Flocalhost%3A8983%2Fsolr%2Fcollection1_shard1_replica2%2F&wt=javabin&version=2","code":400}}
{noformat}

In case the update was done on the leader, the error message is (IMHO) correct 
and contains valuable info:

{noformat}
curl 'http://localhost:8983/solr/collection1/update?commit=true' --data-binary 
@doc.json -H 'Content-type:application/json'
{"responseHeader":{"status":400,"QTime":19},"error":{"msg":"ERROR: 
[doc=01!12967564] Error adding field 'source'='[Direct]' msg=For input string: 
\"Direct\"","code":400}}
{noformat}



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (SOLR-5732) NPE trying get stats with statsComponent

2014-02-14 Thread Yago Riveiro (JIRA)
Yago Riveiro created SOLR-5732:
--

 Summary: NPE trying get stats with statsComponent
 Key: SOLR-5732
 URL: https://issues.apache.org/jira/browse/SOLR-5732
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.6.1
Reporter: Yago Riveiro


I'm trying to get stats over a field with type solr.TrieDateField.

The field is configured as:
{noformat}
<fieldtype name="tdate" class="solr.TrieDateField" precisionStep="8" 
positionIncrementGap="0" sortMissingLast="true" omitNorms="true" 
omitPositions="true" docValuesFormat="Disk"/>
{noformat}

Trying to run this query:
{noformat}
q=datetime:[2014-01-01T00:00:00Z%20TO%202014-01-01T00:10:00Z]&stats=true&stats.field=datetime
{noformat}

I get this exception: http://apaste.info/dWL0

A screenshot of the field flags is 
[here|https://www.dropbox.com/s/6suvoipwuunvk25/Screenshot%202014-02-14%2018.07.59.png]



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-5732) NPE trying get stats with statsComponent

2014-02-14 Thread Yago Riveiro (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5732?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yago Riveiro updated SOLR-5732:
---

Description: 
I'm trying to get stats over a field with type solr.TrieDateField.

The field is configured as:
{noformat}
<fieldtype name="tdate" class="solr.TrieDateField" precisionStep="8" 
positionIncrementGap="0" sortMissingLast="true" omitNorms="true" 
omitPositions="true" docValuesFormat="Disk"/>
{noformat}

Trying to run this query:
{noformat}
q=datetime:[2014-01-01T00:00:00Z%20TO%202014-01-01T00:10:00Z]&stats=true&stats.field=datetime
{noformat}

I get this exception: http://apaste.info/dWL0

A screenshot of the field flags is 
[here|https://www.dropbox.com/s/6suvoipwuunvk25/Screenshot%202014-02-14%2018.07.59.png]

I can run a facet search over the field without any problem.

  was:
I'm trying to get stats over a field with type solr.TrieDateField

The field is configurated as:
{noformat}
 fieldtype name=tdate  class=solr.TrieDateField precisionStep=8 
positionIncrementGap=0 sortMissingLast=true omitNorms=true 
omitPositions=true docValuesFormat=Disk/
{noformat}

Triying to run this query: 
{noformat}
q=datetime:[2014-01-01T00:00:00Z%20TO%202014-01-01T00:10:00Z]stats=truestats.field=datetime
{noformat}

I have this exception: http://apaste.info/dWL0

A printscreen of the field with the flags  
[here|https://www.dropbox.com/s/6suvoipwuunvk25/Screenshot%202014-02-14%2018.07.59.png]


 NPE trying get stats with statsComponent
 

 Key: SOLR-5732
 URL: https://issues.apache.org/jira/browse/SOLR-5732
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.6.1
Reporter: Yago Riveiro

 I'm trying to get stats over a field with type solr.TrieDateField.
 The field is configured as:
 {noformat}
 <fieldtype name="tdate" class="solr.TrieDateField" precisionStep="8" 
 positionIncrementGap="0" sortMissingLast="true" omitNorms="true" 
 omitPositions="true" docValuesFormat="Disk"/>
 {noformat}
 Trying to run this query: 
 {noformat}
 q=datetime:[2014-01-01T00:00:00Z%20TO%202014-01-01T00:10:00Z]&stats=true&stats.field=datetime
 {noformat}
 I get this exception: http://apaste.info/dWL0
 A screenshot of the field flags is 
 [here|https://www.dropbox.com/s/6suvoipwuunvk25/Screenshot%202014-02-14%2018.07.59.png]
 I can run a facet search over the field without any problem.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5724) Two node, one shard solr instance intermittently going offline

2014-02-13 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13900513#comment-13900513
 ] 

Yago Riveiro commented on SOLR-5724:


I have this issue too; the only way that I found to recover from it was to 
restart the nodes.

 Two node, one shard solr instance intermittently going offline 
 ---

 Key: SOLR-5724
 URL: https://issues.apache.org/jira/browse/SOLR-5724
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.6.1
 Environment: Ubuntu 12.04.3 LTS, 64 bit,  java version 1.6.0_45
 Java(TM) SE Runtime Environment (build 1.6.0_45-b06)
 Java HotSpot(TM) 64-Bit Server VM (build 20.45-b01, mixed mode)
Reporter: Joseph Duchesne

 One server is stuck in state recovering while the other is stuck in state 
 down. After waiting 45 minutes or so for the cluster to recover, the 
 statuses were the same. 
 Log messages on the recovering server: (Just the individual errors for 
 brevity, I can provide full stack traces if that is helpful)
 {quote}
 We are not the leader
 ClusterState says we are the leader, but locally we don't think so
 cancelElection did not find election node to remove
 We are not the leader
 No registered leader was found, collection:listsC slice:shard1
 No registered leader was found, collection:listsC slice:shard1
 {quote}
 On the down server at the same timeframe:
 {quote}
 org.apache.solr.common.SolrException; forwarding update to 
 http://10.0.2.48:8983/solr/listsC/ failed - retrying ... retries: 3
 org.apache.solr.update.StreamingSolrServers$1; error
 Error while trying to recover. 
 core=listsC:org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException:
  We are not the leader
 Recovery failed - trying again... (0) core=listsC
 Stopping recovery for zkNodeName=core_node2core=listsC
 org.apache.solr.update.StreamingSolrServers$1; error
 org.apache.solr.common.SolrException: Service Unavailable
 {quote}
 I am not sure what is causing this; however, it has happened 3 times in the 
 past week. If there are any additional logs I can provide, or if there is 
 anything I can do to try to figure this out myself, I will gladly try to help. 



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5670) _version_ either indexed OR docvalue

2014-01-28 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13883958#comment-13883958
 ] 

Yago Riveiro commented on SOLR-5670:


Should the wiki ref be updated with this info?

This is a minor change, but when we are creating the schema, if we want to 
leverage the docvalues feature, this kind of configuration matters.

 _version_ either indexed OR docvalue
 

 Key: SOLR-5670
 URL: https://issues.apache.org/jira/browse/SOLR-5670
 Project: Solr
  Issue Type: Improvement
  Components: SolrCloud
Affects Versions: 4.7
Reporter: Per Steffensen
Assignee: Per Steffensen
  Labels: solr, solrcloud, version
 Fix For: 5.0, 4.7

 Attachments: SOLR-5670.patch, SOLR-5670.patch


 As far as I can see there is no good reason to require that _version_ field 
 has to be indexed if it is docvalued. So I guess it will be ok with a rule 
 saying _version_ has to be either indexed or docvalue (allowed to be both).
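As a sketch, the proposed rule would allow a schema declaration like this (assuming the usual _version_ field definition, with indexing turned off in favour of docValues):

{code}
<field name="_version_" type="long" indexed="false" stored="false" docValues="true"/>
{code}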



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5670) _version_ either indexed OR docvalue

2014-01-28 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13884076#comment-13884076
 ] 

Yago Riveiro commented on SOLR-5670:


The Solr guide are here [Solr Reference 
Guide|https://cwiki.apache.org/confluence/display/solr/Apache+Solr+Reference+Guide]




--
This message was sent by Atlassian JIRA
(v6.1.5#6160)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Comment Edited] (SOLR-5670) _version_ either indexed OR docvalue

2014-01-28 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13884076#comment-13884076
 ] 

Yago Riveiro edited comment on SOLR-5670 at 1/28/14 12:45 PM:
--

The Solr guide is here [Solr Reference 
Guide|https://cwiki.apache.org/confluence/display/solr/Apache+Solr+Reference+Guide]


was (Author: yriveiro):
The Solr guide are here [Solr Reference 
Guide|https://cwiki.apache.org/confluence/display/solr/Apache+Solr+Reference+Guide]




--
This message was sent by Atlassian JIRA
(v6.1.5#6160)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5507) Admin UI - Refactoring using AngularJS

2014-01-04 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13862393#comment-13862393
 ] 

Yago Riveiro commented on SOLR-5507:


+1 for use bootstrap.

With a UI tool library with component to use as is and a plugin system,  we 
will see a lot of new stuff inside the UI and this is good for the community.



 Admin UI - Refactoring using AngularJS
 --

 Key: SOLR-5507
 URL: https://issues.apache.org/jira/browse/SOLR-5507
 Project: Solr
  Issue Type: Improvement
  Components: web gui
Reporter: Stefan Matheis (steffkes)
Assignee: Stefan Matheis (steffkes)
Priority: Minor

 On the LSR in Dublin, i've talked again to [~upayavira] and this time we 
 talked about Refactoring the existing UI - using AngularJS: providing (more, 
 internal) structure and what not ;
 He already started working on the Refactoring, so this is more a 'tracking' 
 issue about the progress he/we do there.
 Will extend this issue with a bit more context & additional information, w/ 
 thoughts about the possible integration in the existing UI and more (:



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Comment Edited] (SOLR-5507) Admin UI - Refactoring using AngularJS

2014-01-04 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13862393#comment-13862393
 ] 

Yago Riveiro edited comment on SOLR-5507 at 1/4/14 7:39 PM:


+1 for use bootstrap.

With an UI tool library with component to use as is and a plugin system,  we 
will see a lot of new stuff inside the UI and this is good for the community.




was (Author: yriveiro):
+1 for use bootstrap.

With a UI tool library with component to use as is and a plugin system,  we 
will see a lot of new stuff inside the UI and this is good for the community.






--
This message was sent by Atlassian JIRA
(v6.1.5#6160)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Comment Edited] (SOLR-5507) Admin UI - Refactoring using AngularJS

2014-01-04 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13862393#comment-13862393
 ] 

Yago Riveiro edited comment on SOLR-5507 at 1/4/14 7:39 PM:


+1 for use bootstrap.

With an UI tool library with components to use as is and a plugin system,  we 
will see a lot of new stuff inside the UI and this is good for the community.




was (Author: yriveiro):
+1 for use bootstrap.

With an UI tool library with component to use as is and a plugin system,  we 
will see a lot of new stuff inside the UI and this is good for the community.



 Admin UI - Refactoring using AngularJS
 --

 Key: SOLR-5507
 URL: https://issues.apache.org/jira/browse/SOLR-5507
 Project: Solr
  Issue Type: Improvement
  Components: web gui
Reporter: Stefan Matheis (steffkes)
Assignee: Stefan Matheis (steffkes)
Priority: Minor

 On the LSR in Dublin, I've talked again to [~upayavira] and this time we 
 talked about refactoring the existing UI using AngularJS: providing (more, 
 internal) structure and what not ;
 He already started working on the refactoring, so this is more a 'tracking' 
 issue about the progress he/we make there.
 Will extend this issue with a bit more context & additional information, w/ 
 thoughts about the possible integration in the existing UI and more (:






[jira] [Commented] (SOLR-5507) Admin UI - Refactoring using AngularJS

2013-12-31 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13859451#comment-13859451
 ] 

Yago Riveiro commented on SOLR-5507:


[~upayavira], what are the reasons for keeping the two UIs working together?

I understand that rewriting the whole UI is an epic task, but the time that we 
will spend thinking about and implementing a way to have the new and the old UI 
working together could be used to finish the new one and release it with a new 
release of Solr.

Also, in this transition, we will (most probably) generate new bugs and 
artefacts. With a single point in time where we switch between both, all bugs will be 
about the new UI.


 Admin UI - Refactoring using AngularJS
 --

 Key: SOLR-5507
 URL: https://issues.apache.org/jira/browse/SOLR-5507
 Project: Solr
  Issue Type: Improvement
  Components: web gui
Reporter: Stefan Matheis (steffkes)
Assignee: Stefan Matheis (steffkes)
Priority: Minor

 On the LSR in Dublin, I've talked again to [~upayavira] and this time we 
 talked about refactoring the existing UI using AngularJS: providing (more, 
 internal) structure and what not ;
 He already started working on the refactoring, so this is more a 'tracking' 
 issue about the progress he/we make there.
 Will extend this issue with a bit more context & additional information, w/ 
 thoughts about the possible integration in the existing UI and more (:






[jira] [Commented] (SOLR-5507) Admin UI - Refactoring using AngularJS

2013-12-31 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13859465#comment-13859465
 ] 

Yago Riveiro commented on SOLR-5507:


Ok, that seems a valid argument :D.

If you release the code and some guidelines about the architecture of the new 
UI, we can work on this new feature and see it in Solr soon.

 Admin UI - Refactoring using AngularJS
 --

 Key: SOLR-5507
 URL: https://issues.apache.org/jira/browse/SOLR-5507
 Project: Solr
  Issue Type: Improvement
  Components: web gui
Reporter: Stefan Matheis (steffkes)
Assignee: Stefan Matheis (steffkes)
Priority: Minor

 On the LSR in Dublin, I've talked again to [~upayavira] and this time we 
 talked about refactoring the existing UI using AngularJS: providing (more, 
 internal) structure and what not ;
 He already started working on the refactoring, so this is more a 'tracking' 
 issue about the progress he/we make there.
 Will extend this issue with a bit more context & additional information, w/ 
 thoughts about the possible integration in the existing UI and more (:






[jira] [Created] (SOLR-5559) DELETE collection command doesn't works in some cases

2013-12-19 Thread Yago Riveiro (JIRA)
Yago Riveiro created SOLR-5559:
--

 Summary: DELETE collection command doesn't works in some cases
 Key: SOLR-5559
 URL: https://issues.apache.org/jira/browse/SOLR-5559
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.6
Reporter: Yago Riveiro









[jira] [Updated] (SOLR-5559) DELETE collection command doesn't works in some cases

2013-12-19 Thread Yago Riveiro (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yago Riveiro updated SOLR-5559:
---

Description: 
I think that I found a bug in the DELETE collection API command.

Environment:
  - N boxes, the number is not important.
  - A collection with N shards spread over the N boxes.
  - Solr.xml old style.
  
I ran the command as 
http://localhost:8983/solr/admin/collections?action=DELETE&name=CollectionX

The command returned a 200, everything was cleaned up and in theory the collection 
was removed ... but for some reason, one of the boxes didn't delete the references 
to CollectionX from solr.xml, and the core folders still exist. The 
clusterstate.json doesn't have CollectionX and /collections doesn't 
show CollectionX either.

The result of this situation is an exception in the overseer queue loop like this:
org.apache.solr.cloud.Overseer$ClusterStateUpdater; Exception in Overseer main 
queue loop

This exception gets the queue stuck, stopping the cluster. I think it is easy to 
replicate with a test.

I think that before sending an OK for the DELETE command we must ensure that 
nothing about this collection still exists on the cluster.
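
A rough sketch of the scenario and the verification step the last paragraph asks for (host, port, and collection name are illustrative; the actual request is commented out because it needs a running cluster):

```shell
# Build the delete request described above (collection name is illustrative).
COLLECTION="CollectionX"
DELETE_URL="http://localhost:8983/solr/admin/collections?action=DELETE&name=${COLLECTION}"

# Issue the delete; per the report, the API can answer 200 even when a node
# fails to clean up:
# curl -s "$DELETE_URL"

# Afterwards, cross-check that no trace of the collection survived: that
# clusterstate.json and /collections in ZooKeeper no longer mention it, and
# that the core directories are gone on every box.
echo "$DELETE_URL"
```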

 DELETE collection command doesn't works in some cases
 -

 Key: SOLR-5559
 URL: https://issues.apache.org/jira/browse/SOLR-5559
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.6
Reporter: Yago Riveiro

 I think that I found a bug in the DELETE collection API command.
 Environment:
   - N boxes, the number is not important.
   - A collection with N shards spread over the N boxes.
   - Solr.xml old style.
   
 I ran the command as 
 http://localhost:8983/solr/admin/collections?action=DELETE&name=CollectionX
 The command returned a 200, everything was cleaned up and in theory the 
 collection was removed ... but for some reason, one of the boxes didn't delete 
 the references to CollectionX from solr.xml, and the core folders still 
 exist. The clusterstate.json doesn't have CollectionX and the 
 /collections doesn't show CollectionX either.
 The result of this situation is an exception in the overseer queue loop like 
 this:
 org.apache.solr.cloud.Overseer$ClusterStateUpdater; Exception in Overseer 
 main queue loop
 This exception gets the queue stuck, stopping the cluster. I think it is easy 
 to replicate with a test.
 I think that before sending an OK for the DELETE command we must ensure that 
 nothing about this collection still exists on the cluster.






[jira] [Updated] (SOLR-5559) DELETE collection command doesn't works in some cases

2013-12-19 Thread Yago Riveiro (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yago Riveiro updated SOLR-5559:
---

Description: 
I think that I found a bug in the DELETE collection API command.

Environment:
  - N boxes, the number is not important.
  - A collection with N shards spread over the N boxes.
  - Solr.xml old style.
  
I ran the command as 
http://localhost:8983/solr/admin/collections?action=DELETE&name=CollectionX

The command returned a 200, everything was cleaned up and in theory the collection 
was removed ... but for some reason, one of the boxes didn't delete the references 
to CollectionX from solr.xml, and the core folders still exist. The 
clusterstate.json doesn't have CollectionX and /collections doesn't 
show CollectionX either.

The result of this situation is an exception in the overseer queue loop like this:

{{org.apache.solr.cloud.Overseer$ClusterStateUpdater; Exception in Overseer 
main queue loop}}

This exception gets the queue stuck, stopping the cluster. I think it is easy to 
replicate with a test case.

I think that before sending an OK for the DELETE command we must ensure that 
nothing about this collection still exists on the cluster.

  was:
I think that I found a bug in the DELETE collection API command.

Environment:
  - N boxes, the number is not important.
  - A collection with N shards spread over the N boxes.
  - Solr.xml old style.
  
I ran the command as 
http://localhost:8983/solr/admin/collections?action=DELETE&name=CollectionX

The command returned a 200, everything was cleaned up and in theory the collection 
was removed ... but for some reason, one of the boxes didn't delete the references 
to CollectionX from solr.xml, and the core folders still exist. The 
clusterstate.json doesn't have CollectionX and /collections doesn't 
show CollectionX either.

The result of this situation is an exception in the overseer queue loop like this:

{{org.apache.solr.cloud.Overseer$ClusterStateUpdater; Exception in Overseer 
main queue loop}}

This exception gets the queue stuck, stopping the cluster. I think it is easy to 
replicate with a test.

I think that before sending an OK for the DELETE command we must ensure that 
nothing about this collection still exists on the cluster.


 DELETE collection command doesn't works in some cases
 -

 Key: SOLR-5559
 URL: https://issues.apache.org/jira/browse/SOLR-5559
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.6
Reporter: Yago Riveiro

 I think that I found a bug in the DELETE collection API command.
 Environment:
   - N boxes, the number is not important.
   - A collection with N shards spread over the N boxes.
   - Solr.xml old style.
   
 I ran the command as 
 http://localhost:8983/solr/admin/collections?action=DELETE&name=CollectionX
 The command returned a 200, everything was cleaned up and in theory the 
 collection was removed ... but for some reason, one of the boxes didn't delete 
 the references to CollectionX from solr.xml, and the core folders still 
 exist. The clusterstate.json doesn't have CollectionX and the 
 /collections doesn't show CollectionX either.
 The result of this situation is an exception in the overseer queue loop like 
 this:
 {{org.apache.solr.cloud.Overseer$ClusterStateUpdater; Exception in Overseer 
 main queue loop}}
 This exception gets the queue stuck, stopping the cluster. I think it is easy 
 to replicate with a test case.
 I think that before sending an OK for the DELETE command we must ensure that 
 nothing about this collection still exists on the cluster.






[jira] [Updated] (SOLR-5559) DELETE collection command doesn't works in some cases

2013-12-19 Thread Yago Riveiro (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yago Riveiro updated SOLR-5559:
---

Description: 
I think that I found a bug in the DELETE collection API command.

Environment:
  - N boxes, the number is not important.
  - A collection with N shards spread over the N boxes.
  - Solr.xml old style.
  
I ran the command as 
http://localhost:8983/solr/admin/collections?action=DELETE&name=CollectionX

The command returned a 200, everything was cleaned up and in theory the collection 
was removed ... but for some reason, one of the boxes didn't delete the references 
to CollectionX from solr.xml, and the core folders still exist. The 
clusterstate.json doesn't have CollectionX and /collections doesn't 
show CollectionX either.

The result of this situation is an exception in the overseer queue loop like this:

{{org.apache.solr.cloud.Overseer$ClusterStateUpdater; Exception in Overseer 
main queue loop}}

This exception gets the queue stuck, stopping the cluster. I think it is easy to 
replicate with a test.

I think that before sending an OK for the DELETE command we must ensure that 
nothing about this collection still exists on the cluster.

  was:
I think that I found a bug in the DELETE collection API command.

Environment:
  - N boxes, the number is not important.
  - A collection with N shards spread over the N boxes.
  - Solr.xml old style.
  
I ran the command as 
http://localhost:8983/solr/admin/collections?action=DELETE&name=CollectionX

The command returned a 200, everything was cleaned up and in theory the collection 
was removed ... but for some reason, one of the boxes didn't delete the references 
to CollectionX from solr.xml, and the core folders still exist. The 
clusterstate.json doesn't have CollectionX and /collections doesn't 
show CollectionX either.

The result of this situation is an exception in the overseer queue loop like this:
org.apache.solr.cloud.Overseer$ClusterStateUpdater; Exception in Overseer main 
queue loop

This exception gets the queue stuck, stopping the cluster. I think it is easy to 
replicate with a test.

I think that before sending an OK for the DELETE command we must ensure that 
nothing about this collection still exists on the cluster.


 DELETE collection command doesn't works in some cases
 -

 Key: SOLR-5559
 URL: https://issues.apache.org/jira/browse/SOLR-5559
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.6
Reporter: Yago Riveiro

 I think that I found a bug in the DELETE collection API command.
 Environment:
   - N boxes, the number is not important.
   - A collection with N shards spread over the N boxes.
   - Solr.xml old style.
   
 I ran the command as 
 http://localhost:8983/solr/admin/collections?action=DELETE&name=CollectionX
 The command returned a 200, everything was cleaned up and in theory the 
 collection was removed ... but for some reason, one of the boxes didn't delete 
 the references to CollectionX from solr.xml, and the core folders still 
 exist. The clusterstate.json doesn't have CollectionX and the 
 /collections doesn't show CollectionX either.
 The result of this situation is an exception in the overseer queue loop like 
 this:
 {{org.apache.solr.cloud.Overseer$ClusterStateUpdater; Exception in Overseer 
 main queue loop}}
 This exception gets the queue stuck, stopping the cluster. I think it is easy 
 to replicate with a test.
 I think that before sending an OK for the DELETE command we must ensure that 
 nothing about this collection still exists on the cluster.






[jira] [Commented] (SOLR-4260) Inconsistent numDocs between leader and replica

2013-12-06 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13841870#comment-13841870
 ] 

Yago Riveiro commented on SOLR-4260:


Replicas are still losing docs in Solr 4.6 :(.

I'm wondering if we couldn't have a pair (version, numDocs) to track the 
increments of docs between versions. We could also save the last 10 tlogs in each 
replica as backups after they are committed, and make a diff to see what is missing 
in case the replicas are out of sync, replay the transactions, and avoid an 
out-of-sync replica and a full recovery that will probably be heavier than 
making the diff.

It's only an idea and, of course, finding the bug must be the priority.

This issue compromises using Solr as the main storage. If re-indexing the data is 
not possible, we can't guarantee that no data is missing, and worse, we lose the 
data forever :(.


 Inconsistent numDocs between leader and replica
 ---

 Key: SOLR-4260
 URL: https://issues.apache.org/jira/browse/SOLR-4260
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
 Environment: 5.0.0.2013.01.04.15.31.51
Reporter: Markus Jelsma
Assignee: Mark Miller
Priority: Critical
 Fix For: 5.0, 4.7

 Attachments: 192.168.20.102-replica1.png, 
 192.168.20.104-replica2.png, clusterstate.png


 After wiping all cores and reindexing some 3.3 million docs from Nutch using 
 CloudSolrServer we see inconsistencies between the leader and replica for 
 some shards.
 Each core hold about 3.3k documents. For some reason 5 out of 10 shards have 
 a small deviation in then number of documents. The leader and slave deviate 
 for roughly 10-20 documents, not more.
 Results hopping ranks in the result set for identical queries got my 
 attention, there were small IDF differences for exactly the same record 
 causing a record to shift positions in the result set. During those tests no 
 records were indexed. Consecutive catch all queries also return different 
 number of numDocs.
 We're running a 10 node test cluster with 10 shards and a replication factor 
 of two and frequently reindex using a fresh build from trunk. I've not seen 
 this issue for quite some time until a few days ago.






[jira] [Commented] (SOLR-5477) Async execution of OverseerCollectionProcessor tasks

2013-11-20 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13827764#comment-13827764
 ] 

Yago Riveiro commented on SOLR-5477:


Related to this feature, we could add a notification panel in the UI.

 Async execution of OverseerCollectionProcessor tasks
 

 Key: SOLR-5477
 URL: https://issues.apache.org/jira/browse/SOLR-5477
 Project: Solr
  Issue Type: Sub-task
  Components: SolrCloud
Reporter: Noble Paul

 Typical collection admin commands are long running and it is very common to 
 have the requests get timed out. It is more of a problem if the cluster is 
 very large. Add an option to run these commands asynchronously:
 add an extra param async=true for all collection commands;
 the task is written to ZK and the caller is returned a task id. 
 A separate collection admin command will be added to poll the status of the 
 task:
 command=status&id=7657668909
 If an id is not passed, all running async tasks should be listed.
 A separate queue is created to store in-process tasks. After the tasks are 
 completed the queue entry is removed. OverseerCollectionProcessor will perform 
 these tasks in multiple threads.
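
The proposed flow can be sketched as two HTTP calls. The action name below is only an illustrative long-running command, the task id is the one from the description, and the parameter names are the ones proposed here, not necessarily a final API:

```shell
BASE="http://localhost:8983/solr/admin/collections"

# 1. Submit a long-running command with the proposed async=true flag and
#    keep the task id the call returns (SPLITSHARD is just an example):
SUBMIT_URL="${BASE}?action=SPLITSHARD&collection=CollectionX&shard=shard1&async=true"

# 2. Poll the separate status command with that id until the task finishes
#    (id value taken from the description above):
POLL_URL="${BASE}?command=status&id=7657668909"

echo "$SUBMIT_URL"
echo "$POLL_URL"
```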






[jira] [Commented] (SOLR-5428) new statistics results to StatsComponent - distinctValues and countDistinct

2013-11-20 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13827788#comment-13827788
 ] 

Yago Riveiro commented on SOLR-5428:


I think that the analytics component doesn't support distributed queries.

 new statistics results to StatsComponent - distinctValues and countDistinct
 ---

 Key: SOLR-5428
 URL: https://issues.apache.org/jira/browse/SOLR-5428
 Project: Solr
  Issue Type: New Feature
Reporter: Elran Dvir
Assignee: Shalin Shekhar Mangar
 Attachments: SOLR-5428.patch, SOLR-5428.patch


 I thought it would be very useful to display the distinct values (and the 
 count) of a field among other statistics. Attached a patch implementing this 
 in StatsComponent.
 Added results:
 distinctValues - list of all distinct values
 countDistinct - distinct values count.






[jira] [Commented] (SOLR-5428) new statistics results to StatsComponent - distinctValues and countDistinct

2013-11-20 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13827816#comment-13827816
 ] 

Yago Riveiro commented on SOLR-5428:


For me the utility of this patch is the possibility of getting distinctValues 
and countDistinct in a distributed environment. If it's possible to implement this 
patch on top of the AnalyticsComponent, I think that should be done, for the simple 
fact that, eventually, the StatsComponent will be deprecated.

The point is that 
[SOLR-5302|https://issues.apache.org/jira/browse/SOLR-5302] will not be 
released soon, maybe in Solr 5.0, while this patch is straightforward 
enough that it could be released in Solr 4.7 with some tweaks. 

 new statistics results to StatsComponent - distinctValues and countDistinct
 ---

 Key: SOLR-5428
 URL: https://issues.apache.org/jira/browse/SOLR-5428
 Project: Solr
  Issue Type: New Feature
Reporter: Elran Dvir
Assignee: Shalin Shekhar Mangar
 Attachments: SOLR-5428.patch, SOLR-5428.patch


 I thought it would be very useful to display the distinct values (and the 
 count) of a field among other statistics. Attached a patch implementing this 
 in StatsComponent.
 Added results  :
 distinctValues - list of all distnict values
 countDistinct -  distnict values count.






[jira] [Commented] (SOLR-4260) Inconsistent numDocs between leader and replica

2013-11-19 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13826356#comment-13826356
 ] 

Yago Riveiro commented on SOLR-4260:


Is it safe to upgrade from 4.5.1 to 4.6? I have docValues, and I read that the 
upgrade is not straightforward, and I can't re-index the data.

 Inconsistent numDocs between leader and replica
 ---

 Key: SOLR-4260
 URL: https://issues.apache.org/jira/browse/SOLR-4260
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Affects Versions: 5.0
 Environment: 5.0.0.2013.01.04.15.31.51
Reporter: Markus Jelsma
Assignee: Mark Miller
Priority: Critical
 Fix For: 5.0

 Attachments: 192.168.20.102-replica1.png, 
 192.168.20.104-replica2.png, clusterstate.png


 After wiping all cores and reindexing some 3.3 million docs from Nutch using 
 CloudSolrServer we see inconsistencies between the leader and replica for 
 some shards.
 Each core hold about 3.3k documents. For some reason 5 out of 10 shards have 
 a small deviation in then number of documents. The leader and slave deviate 
 for roughly 10-20 documents, not more.
 Results hopping ranks in the result set for identical queries got my 
 attention, there were small IDF differences for exactly the same record 
 causing a record to shift positions in the result set. During those tests no 
 records were indexed. Consecutive catch all queries also return different 
 number of numDocs.
 We're running a 10 node test cluster with 10 shards and a replication factor 
 of two and frequently reindex using a fresh build from trunk. I've not seen 
 this issue for quite some time until a few days ago.






[jira] [Commented] (SOLR-4260) Inconsistent numDocs between leader and replica

2013-11-19 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13826533#comment-13826533
 ] 

Yago Riveiro commented on SOLR-4260:


I'm using <codecFactory class="solr.SchemaCodecFactory"/> to enable per-field 
docValues formats.

I think that this aspect of docValues isn't explained on the wiki in a 
proper way. There is no example of how we can switch to the default format, do the 
forceMerge, and switch back to the original implementation.

If I can't be sure that everything will work fine, I can't do the upgrade.
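
For reference, a sketch of the configuration this comment refers to. Element placement follows a typical solrconfig.xml/schema.xml pair; the field type name and the per-field format value are illustrative:

```xml
<!-- solrconfig.xml: enable per-field codec selection -->
<codecFactory class="solr.SchemaCodecFactory"/>

<!-- schema.xml: choose a docValues format per field type
     (type name and format value are illustrative) -->
<fieldType name="string_dv" class="solr.StrField"
           docValues="true" docValuesFormat="Disk"/>
```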

 Inconsistent numDocs between leader and replica
 ---

 Key: SOLR-4260
 URL: https://issues.apache.org/jira/browse/SOLR-4260
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Affects Versions: 5.0
 Environment: 5.0.0.2013.01.04.15.31.51
Reporter: Markus Jelsma
Assignee: Mark Miller
Priority: Critical
 Fix For: 5.0

 Attachments: 192.168.20.102-replica1.png, 
 192.168.20.104-replica2.png, clusterstate.png


 After wiping all cores and reindexing some 3.3 million docs from Nutch using 
 CloudSolrServer we see inconsistencies between the leader and replica for 
 some shards.
 Each core hold about 3.3k documents. For some reason 5 out of 10 shards have 
 a small deviation in then number of documents. The leader and slave deviate 
 for roughly 10-20 documents, not more.
 Results hopping ranks in the result set for identical queries got my 
 attention, there were small IDF differences for exactly the same record 
 causing a record to shift positions in the result set. During those tests no 
 records were indexed. Consecutive catch all queries also return different 
 number of numDocs.
 We're running a 10 node test cluster with 10 shards and a replication factor 
 of two and frequently reindex using a fresh build from trunk. I've not seen 
 this issue for quite some time until a few days ago.






[jira] [Commented] (SOLR-4260) Inconsistent numDocs between leader and replica

2013-11-15 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13823517#comment-13823517
 ] 

Yago Riveiro commented on SOLR-4260:


Jessica, 

At some point in the process the leader can be demoted to replica and the other 
replica, with fewer documents, will become the leader. In this case, the old 
leader (after the recovery) can be updated as usual, and you get the leader 
behind the replica if the recovery doesn't fix the deviation.

 Inconsistent numDocs between leader and replica
 ---

 Key: SOLR-4260
 URL: https://issues.apache.org/jira/browse/SOLR-4260
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Affects Versions: 5.0
 Environment: 5.0.0.2013.01.04.15.31.51
Reporter: Markus Jelsma
Priority: Critical
 Fix For: 5.0

 Attachments: 192.168.20.102-replica1.png, 
 192.168.20.104-replica2.png, clusterstate.png


 After wiping all cores and reindexing some 3.3 million docs from Nutch using 
 CloudSolrServer we see inconsistencies between the leader and replica for 
 some shards.
 Each core hold about 3.3k documents. For some reason 5 out of 10 shards have 
 a small deviation in then number of documents. The leader and slave deviate 
 for roughly 10-20 documents, not more.
 Results hopping ranks in the result set for identical queries got my 
 attention, there were small IDF differences for exactly the same record 
 causing a record to shift positions in the result set. During those tests no 
 records were indexed. Consecutive catch all queries also return different 
 number of numDocs.
 We're running a 10 node test cluster with 10 shards and a replication factor 
 of two and frequently reindex using a fresh build from trunk. I've not seen 
 this issue for quite some time until a few days ago.






[jira] [Commented] (SOLR-4260) Inconsistent numDocs between leader and replica

2013-11-15 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13823867#comment-13823867
 ] 

Yago Riveiro commented on SOLR-4260:


{quote}I thought the updates are synchronously distributed{quote}

My knowledge about how replication is done is very limited; for me, replication 
is a distributed HTTP request to all replicas, and if all responses return the 
code 200, then the insertion was successful. I don't know if internally the 200 
is returned when the document is written to the tlog or to the open segment.

Up-to-date in this case means nothing: your data is compromised and you can't 
guarantee which is the correct replica. The logic could be to pick the replica with 
more docs and make a new replica using it, but you still can't know, without checking 
one by one, whether you have all the data. An extreme case would be to do a full 
re-index of the data (if you can).


 Inconsistent numDocs between leader and replica
 ---

 Key: SOLR-4260
 URL: https://issues.apache.org/jira/browse/SOLR-4260
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Affects Versions: 5.0
 Environment: 5.0.0.2013.01.04.15.31.51
Reporter: Markus Jelsma
Priority: Critical
 Fix For: 5.0

 Attachments: 192.168.20.102-replica1.png, 
 192.168.20.104-replica2.png, clusterstate.png


 After wiping all cores and reindexing some 3.3 million docs from Nutch using 
 CloudSolrServer we see inconsistencies between the leader and replica for 
 some shards.
 Each core hold about 3.3k documents. For some reason 5 out of 10 shards have 
 a small deviation in then number of documents. The leader and slave deviate 
 for roughly 10-20 documents, not more.
 Results hopping ranks in the result set for identical queries got my 
 attention, there were small IDF differences for exactly the same record 
 causing a record to shift positions in the result set. During those tests no 
 records were indexed. Consecutive catch all queries also return different 
 number of numDocs.
 We're running a 10 node test cluster with 10 shards and a replication factor 
 of two and frequently reindex using a fresh build from trunk. I've not seen 
 this issue for quite some time until a few days ago.






[jira] [Commented] (SOLR-4260) Inconsistent numDocs between leader and replica

2013-11-15 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13823887#comment-13823887
 ] 

Yago Riveiro commented on SOLR-4260:


Mark, 

I can confirm that I had session expirations in my logs at some point in time. 
My index rate is high and sometimes my boxes are under some pressure.

My problem is that I don't know how to deal with the situation. I'm using a 
non-Java client, and I don't know how I can debug this or what tools I can use to 
provide information to help debug this issue.

 Inconsistent numDocs between leader and replica
 ---

 Key: SOLR-4260
 URL: https://issues.apache.org/jira/browse/SOLR-4260
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Affects Versions: 5.0
 Environment: 5.0.0.2013.01.04.15.31.51
Reporter: Markus Jelsma
Priority: Critical
 Fix For: 5.0

 Attachments: 192.168.20.102-replica1.png, 
 192.168.20.104-replica2.png, clusterstate.png


 After wiping all cores and reindexing some 3.3 million docs from Nutch using 
 CloudSolrServer we see inconsistencies between the leader and replica for 
 some shards.
 Each core holds about 3.3k documents. For some reason 5 out of 10 shards have 
 a small deviation in the number of documents. The leader and slave deviate 
 by roughly 10-20 documents, not more.
 Results hopping ranks in the result set for identical queries got my 
 attention; there were small IDF differences for exactly the same record, 
 causing a record to shift positions in the result set. During those tests no 
 records were indexed. Consecutive catch-all queries also return different 
 numDocs values.
 We're running a 10 node test cluster with 10 shards and a replication factor 
 of two and frequently reindex using a fresh build from trunk. I've not seen 
 this issue for quite some time until a few days ago.






[jira] [Commented] (SOLR-5428) new statistics results to StatsComponent - distinctValues and countDistinct

2013-11-14 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13822470#comment-13822470
 ] 

Yago Riveiro commented on SOLR-5428:


Collecting the distinctValues can be expensive, but in my case it is a 
requirement that Solr can't satisfy in an easy way. I need to do a facet query 
with limit=-1 to get all the unique terms that match the query.

If the StatsComponent can do the same thing, expensive or not, I vote to have 
the feature. How to use it, and the pros and cons of using it, should be a 
decision made by the user.
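For reference, the facet workaround can be sketched as a request like the 
following (a hedged sketch: the collection name "collection1" and field name 
"category" are made-up examples, not from this issue):

```shell
# facet.limit=-1 removes the cap on facet buckets, so every unique term
# of the field comes back; rows=0 suppresses the document list itself.
SOLR_URL="http://localhost:8983/solr/collection1/select"
QUERY="q=*:*&rows=0&facet=true&facet.field=category&facet.limit=-1&facet.mincount=1"
# curl "${SOLR_URL}?${QUERY}"   # needs a running Solr; shown for shape only
echo "${SOLR_URL}?${QUERY}"
```

This is exactly the expensive-on-high-cardinality-fields trade-off discussed 
above.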

 new statistics results to StatsComponent - distinctValues and countDistinct
 ---

 Key: SOLR-5428
 URL: https://issues.apache.org/jira/browse/SOLR-5428
 Project: Solr
  Issue Type: New Feature
Reporter: Elran Dvir
Assignee: Shalin Shekhar Mangar
 Attachments: SOLR-5428.patch


 I thought it would be very useful to display the distinct values (and the 
 count) of a field among other statistics. Attached a patch implementing this 
 in StatsComponent.
 Added results:
 distinctValues - list of all distinct values
 countDistinct - distinct values count.






[jira] [Commented] (SOLR-5428) new statistics results to StatsComponent - distinctValues and countDistinct

2013-11-14 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13822612#comment-13822612
 ] 

Yago Riveiro commented on SOLR-5428:


Ok, I forgot that the StatsComponent returns all metrics in one call.

Maybe the StatsComponent needs some tweaking to return only the metrics that we 
need, not all of them. If the analytics component could work with distributed 
searches, this patch would not be necessary.

 new statistics results to StatsComponent - distinctValues and countDistinct
 ---

 Key: SOLR-5428
 URL: https://issues.apache.org/jira/browse/SOLR-5428
 Project: Solr
  Issue Type: New Feature
Reporter: Elran Dvir
Assignee: Shalin Shekhar Mangar
 Attachments: SOLR-5428.patch


 I thought it would be very useful to display the distinct values (and the 
 count) of a field among other statistics. Attached a patch implementing this 
 in StatsComponent.
 Added results:
 distinctValues - list of all distinct values
 countDistinct - distinct values count.






[jira] [Commented] (SOLR-5428) new statistics results to StatsComponent - distinctValues and countDistinct

2013-11-13 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13821371#comment-13821371
 ] 

Yago Riveiro commented on SOLR-5428:


This tiny patch is very very useful.

One question: in the case of the Stats component, is all the work done on the 
heap, or does it leverage the benefits of docValues?

 new statistics results to StatsComponent - distinctValues and countDistinct
 ---

 Key: SOLR-5428
 URL: https://issues.apache.org/jira/browse/SOLR-5428
 Project: Solr
  Issue Type: New Feature
Reporter: Elran Dvir
Assignee: Shalin Shekhar Mangar
 Attachments: SOLR-5428.patch


 I thought it would be very useful to display the distinct values (and the 
 count) of a field among other statistics. Attached a patch implementing this 
 in StatsComponent.
 Added results:
 distinctValues - list of all distinct values
 countDistinct - distinct values count.






[jira] [Commented] (SOLR-4260) Inconsistent numDocs between leader and replica

2013-11-13 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13821375#comment-13821375
 ] 

Yago Riveiro commented on SOLR-4260:


{quote} Currently, if shards eventually get out of whack, the best you can do 
is trigger a new recovery against the leader.{quote}

What happens when the leader is the replica with fewer docs? Is the replication 
done in the right way?

 Inconsistent numDocs between leader and replica
 ---

 Key: SOLR-4260
 URL: https://issues.apache.org/jira/browse/SOLR-4260
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Affects Versions: 5.0
 Environment: 5.0.0.2013.01.04.15.31.51
Reporter: Markus Jelsma
Priority: Critical
 Fix For: 5.0

 Attachments: 192.168.20.102-replica1.png, 
 192.168.20.104-replica2.png, clusterstate.png


 After wiping all cores and reindexing some 3.3 million docs from Nutch using 
 CloudSolrServer we see inconsistencies between the leader and replica for 
 some shards.
 Each core holds about 3.3k documents. For some reason 5 out of 10 shards have 
 a small deviation in the number of documents. The leader and slave deviate 
 by roughly 10-20 documents, not more.
 Results hopping ranks in the result set for identical queries got my 
 attention; there were small IDF differences for exactly the same record, 
 causing a record to shift positions in the result set. During those tests no 
 records were indexed. Consecutive catch-all queries also return different 
 numDocs values.
 We're running a 10 node test cluster with 10 shards and a replication factor 
 of two and frequently reindex using a fresh build from trunk. I've not seen 
 this issue for quite some time until a few days ago.






[jira] [Commented] (SOLR-5428) new statistics results to StatsComponent - distinctValues and countDistinct

2013-11-12 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13820159#comment-13820159
 ] 

Yago Riveiro commented on SOLR-5428:


Does this patch work in distributed queries?

 new statistics results to StatsComponent - distinctValues and countDistinct
 ---

 Key: SOLR-5428
 URL: https://issues.apache.org/jira/browse/SOLR-5428
 Project: Solr
  Issue Type: New Feature
Reporter: Elran Dvir
 Attachments: SOLR-5428.patch


 I thought it would be very useful to display the distinct values (and the 
 count) of a field among other statistics. Attached a patch implementing this 
 in StatsComponent.
 Added results:
 distinctValues - list of all distinct values
 countDistinct - distinct values count.






[jira] [Commented] (SOLR-4260) Inconsistent numDocs between leader and replica

2013-11-05 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13813827#comment-13813827
 ] 

Yago Riveiro commented on SOLR-4260:


Hi, I hit this bug with Solr 4.5.1.

replica 1:

lastModified:20 minutes ago
version:80616
numDocs:6072661
maxDoc:6072841
deletedDocs:180

replica 2 (leader)

lastModified:20 minutes ago
version:77595
numDocs:6072575
maxDoc:6072771
deletedDocs:196

I don't know when this happened, therefore I have no time frame within which to 
search the logs for valuable information.
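For what it's worth, the drift above can be checked mechanically. A minimal 
sketch using the numbers reported in this comment (in practice the stats would 
come from each core's status API):

```shell
# Stats as reported above; on each core, numDocs = maxDoc - deletedDocs.
R1_NUMDOCS=6072661; R1_MAXDOC=6072841; R1_DELETED=180
R2_NUMDOCS=6072575; R2_MAXDOC=6072771; R2_DELETED=196   # leader

# Sanity check: each replica is internally consistent.
test "$R1_NUMDOCS" -eq $((R1_MAXDOC - R1_DELETED)) || echo "replica1 inconsistent"
test "$R2_NUMDOCS" -eq $((R2_MAXDOC - R2_DELETED)) || echo "replica2 inconsistent"

# The inconsistency is the difference in live docs between the replicas.
DELTA=$((R1_NUMDOCS - R2_NUMDOCS))
echo "replicas differ by ${DELTA} documents"
```

Both cores are internally consistent here; only the replica-to-replica count 
differs.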

 Inconsistent numDocs between leader and replica
 ---

 Key: SOLR-4260
 URL: https://issues.apache.org/jira/browse/SOLR-4260
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Affects Versions: 5.0
 Environment: 5.0.0.2013.01.04.15.31.51
Reporter: Markus Jelsma
Priority: Critical
 Fix For: 5.0


 After wiping all cores and reindexing some 3.3 million docs from Nutch using 
 CloudSolrServer we see inconsistencies between the leader and replica for 
 some shards.
 Each core holds about 3.3k documents. For some reason 5 out of 10 shards have 
 a small deviation in the number of documents. The leader and slave deviate 
 by roughly 10-20 documents, not more.
 Results hopping ranks in the result set for identical queries got my 
 attention; there were small IDF differences for exactly the same record, 
 causing a record to shift positions in the result set. During those tests no 
 records were indexed. Consecutive catch-all queries also return different 
 numDocs values.
 We're running a 10 node test cluster with 10 shards and a replication factor 
 of two and frequently reindex using a fresh build from trunk. I've not seen 
 this issue for quite some time until a few days ago.






[jira] [Updated] (SOLR-4260) Inconsistent numDocs between leader and replica

2013-11-05 Thread Yago Riveiro (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yago Riveiro updated SOLR-4260:
---

Attachment: 192.168.20.102-replica1.png
192.168.20.104-replica2.png
clusterstate.png

 Inconsistent numDocs between leader and replica
 ---

 Key: SOLR-4260
 URL: https://issues.apache.org/jira/browse/SOLR-4260
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Affects Versions: 5.0
 Environment: 5.0.0.2013.01.04.15.31.51
Reporter: Markus Jelsma
Priority: Critical
 Fix For: 5.0

 Attachments: 192.168.20.102-replica1.png, 
 192.168.20.104-replica2.png, clusterstate.png


 After wiping all cores and reindexing some 3.3 million docs from Nutch using 
 CloudSolrServer we see inconsistencies between the leader and replica for 
 some shards.
 Each core hold about 3.3k documents. For some reason 5 out of 10 shards have 
 a small deviation in then number of documents. The leader and slave deviate 
 for roughly 10-20 documents, not more.
 Results hopping ranks in the result set for identical queries got my 
 attention, there were small IDF differences for exactly the same record 
 causing a record to shift positions in the result set. During those tests no 
 records were indexed. Consecutive catch all queries also return different 
 number of numDocs.
 We're running a 10 node test cluster with 10 shards and a replication factor 
 of two and frequently reindex using a fresh build from trunk. I've not seen 
 this issue for quite some time until a few days ago.






[jira] [Commented] (SOLR-4260) Inconsistent numDocs between leader and replica

2013-11-05 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13813850#comment-13813850
 ] 

Yago Riveiro commented on SOLR-4260:


I attached some screenshots

1 - clusterstate: this screenshot shows replica2 192.168.20.104 as the leader
2 - replica2 has a lower gen than replica1 and is the leader; is this 
correct?

 Inconsistent numDocs between leader and replica
 ---

 Key: SOLR-4260
 URL: https://issues.apache.org/jira/browse/SOLR-4260
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Affects Versions: 5.0
 Environment: 5.0.0.2013.01.04.15.31.51
Reporter: Markus Jelsma
Priority: Critical
 Fix For: 5.0

 Attachments: 192.168.20.102-replica1.png, 
 192.168.20.104-replica2.png, clusterstate.png


 After wiping all cores and reindexing some 3.3 million docs from Nutch using 
 CloudSolrServer we see inconsistencies between the leader and replica for 
 some shards.
 Each core holds about 3.3k documents. For some reason 5 out of 10 shards have 
 a small deviation in the number of documents. The leader and slave deviate 
 by roughly 10-20 documents, not more.
 Results hopping ranks in the result set for identical queries got my 
 attention; there were small IDF differences for exactly the same record, 
 causing a record to shift positions in the result set. During those tests no 
 records were indexed. Consecutive catch-all queries also return different 
 numDocs values.
 We're running a 10 node test cluster with 10 shards and a replication factor 
 of two and frequently reindex using a fresh build from trunk. I've not seen 
 this issue for quite some time until a few days ago.






[jira] [Comment Edited] (SOLR-4260) Inconsistent numDocs between leader and replica

2013-11-05 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13813850#comment-13813850
 ] 

Yago Riveiro edited comment on SOLR-4260 at 11/5/13 11:14 AM:
--

I attached some screenshots

The shard in question is shard11:

1 - clusterstate: this screenshot shows replica2 192.168.20.104 as the leader
2 - replica2 has a lower gen than replica1 and is the leader; is this 
correct?


was (Author: yriveiro):
I attached some screenshots

1 - clusterstate: this screenshot shows replica2 192.168.20.104 as the leader
2 - replica2 has a lower gen than replica1 and is the leader; is this 
correct?

 Inconsistent numDocs between leader and replica
 ---

 Key: SOLR-4260
 URL: https://issues.apache.org/jira/browse/SOLR-4260
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Affects Versions: 5.0
 Environment: 5.0.0.2013.01.04.15.31.51
Reporter: Markus Jelsma
Priority: Critical
 Fix For: 5.0

 Attachments: 192.168.20.102-replica1.png, 
 192.168.20.104-replica2.png, clusterstate.png


 After wiping all cores and reindexing some 3.3 million docs from Nutch using 
 CloudSolrServer we see inconsistencies between the leader and replica for 
 some shards.
 Each core holds about 3.3k documents. For some reason 5 out of 10 shards have 
 a small deviation in the number of documents. The leader and slave deviate 
 by roughly 10-20 documents, not more.
 Results hopping ranks in the result set for identical queries got my 
 attention; there were small IDF differences for exactly the same record, 
 causing a record to shift positions in the result set. During those tests no 
 records were indexed. Consecutive catch-all queries also return different 
 numDocs values.
 We're running a 10 node test cluster with 10 shards and a replication factor 
 of two and frequently reindex using a fresh build from trunk. I've not seen 
 this issue for quite some time until a few days ago.






[jira] [Comment Edited] (SOLR-4260) Inconsistent numDocs between leader and replica

2013-11-05 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13813850#comment-13813850
 ] 

Yago Riveiro edited comment on SOLR-4260 at 11/5/13 11:15 AM:
--

I attached some screenshots

The shard is shard11:

1 - clusterstate: this screenshot shows replica2 192.168.20.104 as the leader
2 - replica2 has a lower gen than replica1 and is the leader; is this 
correct?


was (Author: yriveiro):
I attached some screenshots

The shard in question is shard11:

1 - clusterstate: this screenshot shows replica2 192.168.20.104 as the leader
2 - replica2 has a lower gen than replica1 and is the leader; is this 
correct?

 Inconsistent numDocs between leader and replica
 ---

 Key: SOLR-4260
 URL: https://issues.apache.org/jira/browse/SOLR-4260
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Affects Versions: 5.0
 Environment: 5.0.0.2013.01.04.15.31.51
Reporter: Markus Jelsma
Priority: Critical
 Fix For: 5.0

 Attachments: 192.168.20.102-replica1.png, 
 192.168.20.104-replica2.png, clusterstate.png


 After wiping all cores and reindexing some 3.3 million docs from Nutch using 
 CloudSolrServer we see inconsistencies between the leader and replica for 
 some shards.
 Each core holds about 3.3k documents. For some reason 5 out of 10 shards have 
 a small deviation in the number of documents. The leader and slave deviate 
 by roughly 10-20 documents, not more.
 Results hopping ranks in the result set for identical queries got my 
 attention; there were small IDF differences for exactly the same record, 
 causing a record to shift positions in the result set. During those tests no 
 records were indexed. Consecutive catch-all queries also return different 
 numDocs values.
 We're running a 10 node test cluster with 10 shards and a replication factor 
 of two and frequently reindex using a fresh build from trunk. I've not seen 
 this issue for quite some time until a few days ago.






[jira] [Commented] (SOLR-5381) Split Clusterstate and scale

2013-10-30 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13809184#comment-13809184
 ] 

Yago Riveiro commented on SOLR-5381:


{quote}
There will be a separate thread for each external collection
{quote}

If we have 100K collections, does that mean we need 100K threads?

They are spread across all the machines of the cluster, but it's still too much.

I could be wrong, but if we have 100K collections and only 10% are active at a 
time, we still need to allocate resources for all 100K threads.

Would it not be possible to have a pool with X threads (X configurable) that 
handles external collections?
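The pool idea in the question can be illustrated from the command line with 
xargs as a stand-in for a worker pool (purely illustrative; the collection 
names and the printf "work" are made up, and this is not how SOLR-5381 is 
implemented):

```shell
# Refresh state for many "collections" with a fixed pool of 8 parallel
# workers (xargs -P 8) rather than one dedicated worker per collection;
# printf stands in for the real state-refresh work.
seq 1 100 | xargs -P 8 -I{} printf 'refreshed collection-%s\n' {} > refreshed.log
wc -l < refreshed.log
```

All 100 collections get processed, but never more than 8 at a time; the pool 
size, not the collection count, bounds resource usage.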

 Split Clusterstate and scale 
 -

 Key: SOLR-5381
 URL: https://issues.apache.org/jira/browse/SOLR-5381
 Project: Solr
  Issue Type: Improvement
  Components: SolrCloud
Reporter: Noble Paul
Assignee: Noble Paul
   Original Estimate: 2,016h
  Remaining Estimate: 2,016h

 clusterstate.json is a single point of contention for all components in 
 SolrCloud. It would be hard to scale SolrCloud beyond a few thousand nodes 
 because there are too many updates and too many nodes need to be notified of 
 the changes. As the number of nodes goes up, the size of clusterstate.json keeps 
 growing and will soon exceed the limit imposed by ZK.
 The first step is to store the shard information in separate nodes so that each 
 node can just listen to the shard node it belongs to. We may also need to 
 split each collection into its own node, with clusterstate.json just 
 holding the names of the collections.
 This is an umbrella issue






[jira] [Commented] (SOLR-5381) Split Clusterstate and scale

2013-10-24 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13804240#comment-13804240
 ] 

Yago Riveiro commented on SOLR-5381:


The SolrCloud environment is young and has some bugs, but it is relatively 
stable; an epic refactoring could be worse than the current scenario. Stability 
must be the goal.


 Split Clusterstate and scale 
 -

 Key: SOLR-5381
 URL: https://issues.apache.org/jira/browse/SOLR-5381
 Project: Solr
  Issue Type: Improvement
  Components: SolrCloud
Reporter: Noble Paul
Assignee: Noble Paul
   Original Estimate: 2,016h
  Remaining Estimate: 2,016h

 clusterstate.json is a single point of contention for all components in 
 SolrCloud. It would be hard to scale SolrCloud beyond a few thousand nodes 
 because there are too many updates and too many nodes need to be notified of 
 the changes. As the number of nodes goes up, the size of clusterstate.json keeps 
 growing and will soon exceed the limit imposed by ZK.
 The first step is to store the shard information in separate nodes so that each 
 node can just listen to the shard node it belongs to. We may also need to 
 split each collection into its own node, with clusterstate.json just 
 holding the names of the collections.
 This is an umbrella issue






[jira] [Commented] (SOLR-5381) Split Clusterstate and scale

2013-10-23 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13803499#comment-13803499
 ] 

Yago Riveiro commented on SOLR-5381:


I hit the ZK limit of 1M for the clusterstate node with more than 10K 
collections, with 3 shards and replicationFactor=2.

I found a workaround for this using the -Djute.maxbuffer parameter configured 
on ZK and Solr, but ZooKeeper's documentation says that this can be unstable.

I don't know whether having a clusterstate.json with so many collections 
degrades performance, but it is too difficult to manage.

If each collection had its own clusterstate.json, migrating a collection to 
another cluster would be easier: you would only need to copy the clusterstate 
and the core folders to the other cluster, and it's done. You would have the 
problematic collection isolated with its own resources.




 Split Clusterstate and scale 
 -

 Key: SOLR-5381
 URL: https://issues.apache.org/jira/browse/SOLR-5381
 Project: Solr
  Issue Type: Improvement
  Components: SolrCloud
Reporter: Noble Paul
Assignee: Noble Paul
   Original Estimate: 2,016h
  Remaining Estimate: 2,016h

 clusterstate.json is a single point of contention for all components in 
 SolrCloud. It would be hard to scale SolrCloud beyond a few thousand nodes 
 because there are too many updates and too many nodes need to be notified of 
 the changes. As the number of nodes goes up, the size of clusterstate.json keeps 
 growing and will soon exceed the limit imposed by ZK.
 The first step is to store the shard information in separate nodes so that each 
 node can just listen to the shard node it belongs to. We may also need to 
 split each collection into its own node, with clusterstate.json just 
 holding the names of the collections.
 This is an umbrella issue






[jira] [Created] (SOLR-5317) CoreAdmin API is not persisting data properly

2013-10-08 Thread Yago Riveiro (JIRA)
Yago Riveiro created SOLR-5317:
--

 Summary: CoreAdmin API is not persisting data properly
 Key: SOLR-5317
 URL: https://issues.apache.org/jira/browse/SOLR-5317
 Project: Solr
  Issue Type: Bug
Reporter: Yago Riveiro
Priority: Critical


There is a regression between 4.4 and 4.5 in the CoreAdmin API: the command 
doesn't persist the result to solr.xml at the time it is executed.

The full process is described here: https://gist.github.com/yriveiro/6883208






[jira] [Commented] (SOLR-5081) Highly parallel document insertion hangs SolrCloud

2013-08-06 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13730995#comment-13730995
 ] 

Yago Riveiro commented on SOLR-5081:


I have this problem too, but in my case Solr hangs and I can't do more 
insertions without restarting the nodes.

I do the insertion using a curl POST in JSON format, like:

curl http://127.0.0.1:8983/solr/collection/update --data-binary @data -H 
'Content-type:application/json'
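For context, the @data file in that command is a JSON array of documents; a 
minimal sketch (the field names here are hypothetical, not from this issue):

```shell
# A minimal @data payload for the curl command above: Solr's JSON update
# format accepts a top-level array of documents.
cat > data <<'EOF'
[
  {"id": "doc-1", "title": "first document"},
  {"id": "doc-2", "title": "second document"}
]
EOF
cat data
```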

 Highly parallel document insertion hangs SolrCloud
 --

 Key: SOLR-5081
 URL: https://issues.apache.org/jira/browse/SOLR-5081
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Affects Versions: 4.3.1
Reporter: Mike Schrag
 Attachments: threads.txt


 If I do a highly parallel document load using a Hadoop cluster into an 18 
 node solrcloud cluster, I can deadlock solr every time.
 The ulimits on the nodes are:
 core file size  (blocks, -c) 0
 data seg size   (kbytes, -d) unlimited
 scheduling priority (-e) 0
 file size   (blocks, -f) unlimited
 pending signals (-i) 1031181
 max locked memory   (kbytes, -l) unlimited
 max memory size (kbytes, -m) unlimited
 open files  (-n) 32768
 pipe size(512 bytes, -p) 8
 POSIX message queues (bytes, -q) 819200
 real-time priority  (-r) 0
 stack size  (kbytes, -s) 10240
 cpu time   (seconds, -t) unlimited
 max user processes  (-u) 515590
 virtual memory  (kbytes, -v) unlimited
 file locks  (-x) unlimited
 The open file count is only around 4000 when this happens.
 If I bounce all the servers, things start working again, which makes me think 
 this is Solr and not ZK.
 I'll attach the stack trace from one of the servers.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-4753) SEVERE: Too many close [count:-1] for SolrCore in logs (4.2.1)

2013-05-08 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13652018#comment-13652018
 ] 

Yago Riveiro commented on SOLR-4753:


Hi,

Today I upgraded Solr to 4.3 and hit the same issue.


{noformat}
65907 4159431 [node02.solrcloud-startStop-1-EventThread] INFO  
org.apache.solr.common.cloud.ZkStateReader  – A cluster state change: 
WatchedEvent state:SyncConnected type:NodeDataChanged path:/clusterstate.json, 
has occurred - updating... (live nodes size: 4)
65908 4159698 [RecoveryThread] INFO  org.apache.solr.cloud.RecoveryStrategy  – 
Finished recovery process. core=ST-0712
65909 4159698 [RecoveryThread] INFO  org.apache.solr.core.SolrCore  – [ST-0712] 
 CLOSING SolrCore org.apache.solr.core.SolrCore@73e004a
65910 4159698 [RecoveryThread] INFO  org.apache.solr.update.UpdateHandler  – 
closing DirectUpdateHandler2{commits=0,autocommit maxDocs=5000,autocommit 
maxTime=1ms,autocommits=0,soft autocommit maxTime=2500ms,soft 
autocommits=0,optimizes=0,rollbacks=0,expungeDeletes=0,docsPending=0,adds=0,deletesById=0,deletesByQuery=0,errors=0,cumulative_adds=0,cumulative_deletesById=0,cumulative_deletesByQuery=0,cumulative_errors=0}
65911 4159699 [RecoveryThread] INFO  org.apache.solr.core.SolrCore  – [ST-0712] 
Closing main searcher on request.
65912 4159699 [catalina-exec-12] ERROR org.apache.solr.core.SolrCore  – Too 
many close [count:-1] on org.apache.solr.core.SolrCore@73e004a. Please report 
this exception to solr-u...@lucene.apache.org
65913 4159699 [catalina-exec-12] INFO  org.apache.solr.core.CoreContainer  – 
Persisting cores config to /opt/node02.solrcloud/solr/home/solr.xml
65914 4160185 [catalina-exec-12] INFO  org.apache.solr.core.SolrXMLSerializer  
– Persisting cores config to /opt/node02.solrcloud/solr/home/solr.xml
{noformat}

 SEVERE: Too many close [count:-1] for SolrCore in logs (4.2.1)
 --

 Key: SOLR-4753
 URL: https://issues.apache.org/jira/browse/SOLR-4753
 Project: Solr
  Issue Type: Bug
Reporter: Mark Miller
Assignee: Mark Miller
 Fix For: 5.0, 4.4


 a user reported core reference counting issues in 4.2.1...
 http://markmail.org/message/akrrj5o24prasm6e




[jira] [Commented] (SOLR-4793) Solr Cloud can't upload large config files (> 1MB) to Zookeeper

2013-05-07 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13651066#comment-13651066
 ] 

Yago Riveiro commented on SOLR-4793:


A workaround for this is to set the -Djute.maxbuffer parameter to a value 
greater than 1M.

From the ZooKeeper 
[doc|http://zookeeper.apache.org/doc/r3.4.5/zookeeperAdmin.html#sc_configuration]

{quote}
jute.maxbuffer:
(Java system property: jute.maxbuffer)

This option can only be set as a Java system property. There is no zookeeper 
prefix on it. It specifies the maximum size of the data that can be stored in a 
znode. The default is 0xfffff, or just under 1M. If this option is changed, the 
system property must be set on all servers and clients otherwise problems will 
arise. This is really a sanity check. ZooKeeper is designed to store data on 
the order of kilobytes in size.
{quote}

Note that to use this configuration parameter, it is necessary to set it in 
both the SolrCloud and ZooKeeper init scripts.
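A sketch of what that can look like (the variable names and the 4 MB value are 
assumptions; adapt to your own init scripts):

```shell
# The value must be identical on every ZooKeeper server and every Solr node.
JUTE_MAXBUFFER=4194304                                  # 4 MB; ZK default is 0xfffff (~1 MB)
# ZooKeeper side: zkServer.sh picks up JVMFLAGS.
export JVMFLAGS="-Djute.maxbuffer=${JUTE_MAXBUFFER}"
# Solr side: append the same flag to the JVM options passed at startup.
SOLR_OPTS="$SOLR_OPTS -Djute.maxbuffer=${JUTE_MAXBUFFER}"
echo "$SOLR_OPTS"
```

If the two sides disagree, the larger znodes will be readable by one and 
rejected by the other, which is exactly the instability the ZooKeeper docs warn 
about.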

 Solr Cloud can't upload large config files (> 1MB) to Zookeeper
 -

 Key: SOLR-4793
 URL: https://issues.apache.org/jira/browse/SOLR-4793
 Project: Solr
  Issue Type: Improvement
Reporter: Son Nguyen

 ZooKeeper sets the znode size limit to 1MB by default, so we can't start Solr 
 Cloud with some large config files, like synonyms.txt.
 Jan Høydahl has a good idea:
 SolrCloud is designed with an assumption that you should be able to upload 
 your whole disk-based conf folder into ZK, and that you should be able to add 
 an empty Solr node to a cluster and it would download all config from ZK. So 
 immediately a splitting strategy automatically handled by ZkSolrResourceLoader 
 for large files could be one way forward, i.e. store synonyms.txt as e.g. 
 __001_synonyms.txt, __002_synonyms.txt



