Re: Call for help: moving from ant build to gradle

2018-11-05 Thread Yago Riveiro


Re: [jira] [Created] (LUCENE-8264) Allow an option to rewrite all segments

2018-04-22 Thread Yago Riveiro
Hi Erick,

“Re-index from scratch” has always been the main concern in every major update. Our 
cluster has ~15T of data, and re-indexing without downtime is always an epic task, 
not only in time but also in the temporary resources needed to keep everything responsive.

It’s difficult for me to justify to my boss that every time we hit a bug, we need 
to upgrade to a major version and do a full re-index.

If this issue can resolve some of the pains of upgrading to a major version, it 
will be very welcome.

Regards

--

Yago Riveiro

On 22 Apr 2018 04:34 +0100, Erick Erickson (JIRA) , wrote:
> Erick Erickson created LUCENE-8264:
> --
>
> Summary: Allow an option to rewrite all segments
> Key: LUCENE-8264
> URL: https://issues.apache.org/jira/browse/LUCENE-8264
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Erick Erickson
> Assignee: Erick Erickson
>
>
> For the background, see SOLR-12259.
>
> There are several use-cases that would be much easier, especially during 
> upgrades, if we could specify that all segments get rewritten.
>
> One example: Upgrading 5x->6x->7x. When segments are merged, they're 
> rewritten into the current format. However, there's no guarantee that a 
> particular segment _ever_ gets merged so the 6x-7x upgrade won't necessarily 
> be successful.
>
> How many merge policies support this is an open question. I propose to start 
> with TMP and raise other JIRAs as necessary for other merge policies.
>
> So far the usual response has been "re-index from scratch", but that's 
> increasingly difficult as systems get larger.
>
>
>
> --
> This message was sent by Atlassian JIRA
> (v7.6.3#76005)
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
>


[jira] [Comment Edited] (SOLR-10987) Solr Cloud overseer node becomes unreachable. Issue Started Recently

2017-07-03 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16072988#comment-16072988
 ] 

Yago Riveiro edited comment on SOLR-10987 at 7/4/17 12:14 AM:
--

If the deployment didn't change at all in the last 7 months, it should be an external 
cause: network issues, GC pauses due to a lack of resources ...

You should post this kind of issue on the mailing list before opening a 
ticket; you will get more visibility and a faster response.






was (Author: yriveiro):
If the deployment didn't change at all in the last 7 months, it should be an external 
cause: network issues, GC pauses due to a lack of resources ...

You should post these things on the mailing list before opening a ticket; you 
will get more visibility and a faster response.





> Solr Cloud overseer node becomes unreachable. Issue Started Recently
> 
>
> Key: SOLR-10987
> URL: https://issues.apache.org/jira/browse/SOLR-10987
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: SolrCloud
>Affects Versions: 6.1
> Environment: *The following is the usage on each of the Solr Nodes:*
> Tasks: 254 total,   1 running, 252 sleeping,   0 stopped,   1 zombie
> %Cpu(s):  0.4 us,  0.3 sy,  0.0 ni, 99.2 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 
> st
> KiB Mem : 20392276 total,  4169296 free,  2917012 used, 13305968 buff/cache
> KiB Swap:  5111804 total,  5111636 free,  168 used. 16058184 avail Mem
>   PID USER  PR  NIVIRTRESSHR S  %CPU %MEM TIME+ COMMAND
> 21250 solr  20   0 23.599g 1.184g 228440 S   2.0  6.1  59:55.91 java
> *Solr is running on 5 machines with similar configuration:*
> Architecture:  x86_64
> CPU op-mode(s):32-bit, 64-bit
> Byte Order:Little Endian
> CPU(s):4
> On-line CPU(s) list:   0-3
> Thread(s) per core:1
> Core(s) per socket:2
> Socket(s): 2
> NUMA node(s):  1
> Vendor ID: GenuineIntel
> CPU family:6
> Model: 62
> Model name:Intel(R) Xeon(R) CPU E5-2680 v2 @ 2.80GHz
> Stepping:  4
> CPU MHz:   2799.033
> BogoMIPS:  5600.00
> Hypervisor vendor: VMware
> Virtualization type:   full
> L1d cache: 32K
> L1i cache: 32K
> L2 cache:  256K
> L3 cache:  25600K
> NUMA node0 CPU(s): 0-3
>Reporter: RAHAT BHALLA
>  Labels: assistance, critical, customer, impacting, issue, need, 
> production
>
> We host a Solr Cloud of 5 Nodes for Solr Instances and 3 Zookeeper nodes to 
> maintain the cloud. We have over 70 million docs spread across 13 collections 
> with 40K more documents added every day in near real time, in spans 
> of 5 to 6 minutes.
> The system was working as expected and as required for the last 7 months 
> until suddenly we saw the following exception and all of our instances went 
> offline. We restarted the instances and the cloud ran smoothly for three days 
> before it came crashing down again.
> *Exception It gives before it goes down is as follows:*
> 3542285 ERROR 
> (OverseerCollectionConfigSetProcessor-98221003671470081-prod-solr-node01:9080_solr-n_000106)
>  [   ] o.a.s.c.OverseerTaskProcessor
> org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode 
> = ConnectionLoss for /overseer_elect/leader
> at 
> org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
> at 
> org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
> at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1155)
> at 
> org.apache.solr.common.cloud.SolrZkClient$7.execute(SolrZkClient.java:348)
> at 
> org.apache.solr.common.cloud.SolrZkClient$7.execute(SolrZkClient.java:345)
> at 
> org.apache.solr.common.cloud.ZkCmdExecutor.retryOperation(ZkCmdExecutor.java:60)
> at 
> org.apache.solr.common.cloud.SolrZkClient.getData(SolrZkClient.java:345)
> at 
> org.apache.solr.cloud.OverseerTaskProcessor.amILeader(OverseerTaskProcessor.java:384)
> at 
> org.apache.solr.cloud.OverseerTaskProcessor.run(OverseerTaskProcessor.java:191)
> at java.lang.Thread.run(Unknown Source)






[jira] [Commented] (SOLR-10987) Solr Cloud overseer node becomes unreachable. Issue Started Recently

2017-07-03 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16072988#comment-16072988
 ] 

Yago Riveiro commented on SOLR-10987:
-

If the deployment didn't change at all in the last 7 months, it should be an external 
cause: network issues, GC pauses due to a lack of resources ...

You should post these things on the mailing list before opening a ticket; you 
will get more visibility and a faster response.





> Solr Cloud overseer node becomes unreachable. Issue Started Recently
> 
>
> Key: SOLR-10987
> URL: https://issues.apache.org/jira/browse/SOLR-10987
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: SolrCloud
>Affects Versions: 6.1
> Environment: *The following is the usage on each of the Solr Nodes:*
> Tasks: 254 total,   1 running, 252 sleeping,   0 stopped,   1 zombie
> %Cpu(s):  0.4 us,  0.3 sy,  0.0 ni, 99.2 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 
> st
> KiB Mem : 20392276 total,  4169296 free,  2917012 used, 13305968 buff/cache
> KiB Swap:  5111804 total,  5111636 free,  168 used. 16058184 avail Mem
>   PID USER  PR  NIVIRTRESSHR S  %CPU %MEM TIME+ COMMAND
> 21250 solr  20   0 23.599g 1.184g 228440 S   2.0  6.1  59:55.91 java
> *Solr is running on 5 machines with similar configuration:*
> Architecture:  x86_64
> CPU op-mode(s):32-bit, 64-bit
> Byte Order:Little Endian
> CPU(s):4
> On-line CPU(s) list:   0-3
> Thread(s) per core:1
> Core(s) per socket:2
> Socket(s): 2
> NUMA node(s):  1
> Vendor ID: GenuineIntel
> CPU family:6
> Model: 62
> Model name:Intel(R) Xeon(R) CPU E5-2680 v2 @ 2.80GHz
> Stepping:  4
> CPU MHz:   2799.033
> BogoMIPS:  5600.00
> Hypervisor vendor: VMware
> Virtualization type:   full
> L1d cache: 32K
> L1i cache: 32K
> L2 cache:  256K
> L3 cache:  25600K
> NUMA node0 CPU(s): 0-3
>Reporter: RAHAT BHALLA
>  Labels: assistance, critical, customer, impacting, issue, need, 
> production
>
> We host a Solr Cloud of 5 Nodes for Solr Instances and 3 Zookeeper nodes to 
> maintain the cloud. We have over 70 million docs spread across 13 collections 
> with 40K more documents added every day in near real time, in spans 
> of 5 to 6 minutes.
> The system was working as expected and as required for the last 7 months 
> until suddenly we saw the following exception and all of our instances went 
> offline. We restarted the instances and the cloud ran smoothly for three days 
> before it came crashing down again.
> *Exception It gives before it goes down is as follows:*
> 3542285 ERROR 
> (OverseerCollectionConfigSetProcessor-98221003671470081-prod-solr-node01:9080_solr-n_000106)
>  [   ] o.a.s.c.OverseerTaskProcessor
> org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode 
> = ConnectionLoss for /overseer_elect/leader
> at 
> org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
> at 
> org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
> at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1155)
> at 
> org.apache.solr.common.cloud.SolrZkClient$7.execute(SolrZkClient.java:348)
> at 
> org.apache.solr.common.cloud.SolrZkClient$7.execute(SolrZkClient.java:345)
> at 
> org.apache.solr.common.cloud.ZkCmdExecutor.retryOperation(ZkCmdExecutor.java:60)
> at 
> org.apache.solr.common.cloud.SolrZkClient.getData(SolrZkClient.java:345)
> at 
> org.apache.solr.cloud.OverseerTaskProcessor.amILeader(OverseerTaskProcessor.java:384)
> at 
> org.apache.solr.cloud.OverseerTaskProcessor.run(OverseerTaskProcessor.java:191)
> at java.lang.Thread.run(Unknown Source)






[jira] [Commented] (SOLR-9824) Documents indexed in bulk are replicated using too many HTTP requests

2017-06-12 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16046593#comment-16046593
 ] 

Yago Riveiro commented on SOLR-9824:


Will this be backported to the 6.x branch?

> Documents indexed in bulk are replicated using too many HTTP requests
> -
>
> Key: SOLR-9824
> URL: https://issues.apache.org/jira/browse/SOLR-9824
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: SolrCloud
>Affects Versions: 6.3
>Reporter: David Smiley
>Assignee: Mark Miller
> Attachments: SOLR-9824.patch, SOLR-9824.patch, SOLR-9824.patch, 
> SOLR-9824.patch, SOLR-9824.patch, SOLR-9824.patch, SOLR-9824.patch, 
> SOLR-9824-tflobbe.patch
>
>
> This takes awhile to explain; bear with me. While working on bulk indexing 
> small documents, I looked at the logs of my SolrCloud nodes.  I noticed that 
> shards would see an /update log message every ~6ms which is *way* too much.  
> These are requests from one shard (that isn't a leader/replica for these docs 
> but the recipient from my client) to the target shard leader (no additional 
> replicas).  One might ask why I'm not sending docs to the right shard in the 
> first place; I have a reason but it's beside the point -- there's a real 
> Solr perf problem here and this probably applies equally to 
> replicationFactor>1 situations too.  I could turn off the logs but that would 
> hide useful stuff, and it's disconcerting to me that so many short-lived HTTP 
> requests are happening, somehow at the behest of DistributedUpdateProcessor. 
>  After lots of analysis and debugging and hair pulling, I finally figured it 
> out.  
> In SOLR-7333 ([~tpot]) introduced an optimization called 
> {{UpdateRequest.isLastDocInBatch()}} in which ConcurrentUpdateSolrClient will 
> poll with a '0' timeout to the internal queue, so that it can close the 
> connection without it hanging around any longer than needed.  This part makes 
> sense to me.  Currently the only spot that has the smarts to set this flag is 
> {{JavaBinUpdateRequestCodec.unmarshal.readOuterMostDocIterator()}} at the 
> last document.  So if a shard received docs in a javabin stream (but not 
> other formats) one would expect the _last_ document to have this flag.  
> There's even a test.  Docs without this flag get the default poll time; for 
> javabin it's 25ms.  Okay.
> I _suspect_ that if someone used CloudSolrClient or HttpSolrClient to send 
> javabin data in a batch, the intended efficiencies of SOLR-7333 would apply.  
> I didn't try. In my case, I'm using ConcurrentUpdateSolrClient (and BTW 
> DistributedUpdateProcessor uses CUSC too).  CUSC uses the RequestWriter 
> (defaulting to javabin) to send each document separately without any leading 
> marker or trailing marker.  For the XML format by comparison, there is a 
> leading and trailing marker ( ... ).  Since there's no outer 
> container for the javabin unmarshalling to detect the last document, it marks 
> _every_ document as {{req.lastDocInBatch()}}!  Ouch!
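The failure mode described above can be sketched with a small, self-contained simulation. This is illustrative Python, not the actual javabin codec; the function and stream shape are invented:

```python
def unmarshal(stream):
    """Toy decoder: yields (doc, is_last_in_batch) for a document stream.

    With an outer container the decoder knows where the batch ends, so
    only the final document is flagged. Without one, it cannot look
    ahead, so it flags *every* document as the last in its batch.
    """
    docs, has_container = stream
    if has_container:
        for i, doc in enumerate(docs):
            yield doc, i == len(docs) - 1
    else:
        # No outer marker: each doc looks like the end of its own batch.
        for doc in docs:
            yield doc, True

framed = (["a", "b", "c"], True)
unframed = (["a", "b", "c"], False)
print([last for _, last in unmarshal(framed)])    # [False, False, True]
print([last for _, last in unmarshal(unframed)])  # [True, True, True]
```

The second output is the "Ouch!" case: every document triggers the zero-timeout poll intended only for batch ends.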






Re: [jira] [Commented] (SOLR-4987) Add reindex API

2017-05-14 Thread Yago Riveiro
The only limitation that I can see for this is the fact that the fl param 
doesn’t support wildcards like * and you need to pass the full set of fields.

This doesn’t play well with indexes where documents don’t have a homogeneous schema.

--

/Yago Riveiro

On 15 May 2017 00:57 +0100, Joel Bernstein (JIRA) , wrote:
>
> [ 
> https://issues.apache.org/jira/browse/SOLR-4987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16009888#comment-16009888
>  ]
>
> Joel Bernstein commented on SOLR-4987:
> --
>
> Streaming Expressions are very strong at re-indexing. Stored fields and 
> docValues are both supported. Massive amounts of data can be moved quickly 
> using the techniques described here:
>
> http://joelsolr.blogspot.com/2016/10/solr-63-batch-jobs-parallel-etl-and.html
>
> > Add reindex API
> > ---
> >
> > Key: SOLR-4987
> > URL: https://issues.apache.org/jira/browse/SOLR-4987
> > Project: Solr
> > Issue Type: New Feature
> > Affects Versions: 4.3.1
> > Reporter: Shawn Heisey
> > Priority: Minor
> >
> > A lot of users ask "how do I reindex?" We have a wiki page that explains 
> > this:
> > http://wiki.apache.org/solr/HowToReindex
> > For many such users, this is not acceptable. They assume that once the 
> > information is in Solr, it should be possible for Solr to change how it's 
> > indexed at the touch of a button. I don't think they like it when they are 
> > told that it's not possible.
> > Perhaps it's time to give these users what they want -- if they store all 
> > fields that are not copyField destinations.
> > A note to people who find this issue: Until it is marked Resolved with a 
> > "Fixed" or "Implemented" notation, be aware that this has not yet happened, 
> > and there is no guarantee that it WILL happen.
>
>
>
>


[jira] [Commented] (SOLR-10150) Solr 6.4 up to 10x slower than 6.3

2017-02-16 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15870363#comment-15870363
 ] 

Yago Riveiro commented on SOLR-10150:
-

I think this is a duplicate of SOLR-10130

> Solr 6.4 up to 10x slower than 6.3
> --
>
> Key: SOLR-10150
> URL: https://issues.apache.org/jira/browse/SOLR-10150
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Fabrizio Fortino
>Priority: Critical
>  Labels: performance
> Attachments: Screen Shot 2017-02-16 at 17.31.02.png
>
>
> We noticed a considerable performance degradation (5x to 10x) using Solr 6.4 
> and huge increase of CPU utilization. Our use case is pretty simple: we have 
> a single Solr Core with around 600K small size documents. We just do lookups 
> by key (no full text searches) and use faceting capabilities.
> Using the Solr Admin Thread Dump utilities we noticed a lot of threads using 
> considerable cpuTime / userTime on Codahale metrics (snapshot attached). The 
> metrics part has been drastically changed in 6.4 
> (https://issues.apache.org/jira/browse/SOLR-4735). Rolling back to Solr 6.3 
> has solved our performance problems.
> Is there any way to disable these metrics in version 6.4 ?
> Thanks!






Re: [jira] [Commented] (SOLR-8029) Modernize and standardize Solr APIs

2017-01-27 Thread Yago Riveiro
No command to list aliases?

Calling CLUSTERSTATUS to fetch the list of available aliases is annoying; aliases 
belong to collections, not to the cluster.
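For illustration, extracting aliases client-side from a CLUSTERSTATUS-style response can be done as below. The sample payload is invented, and the shape assumed here (aliases under `cluster.aliases` as a map of alias name to a comma-separated collection list) should be treated as an assumption, not a spec:

```python
import json

# A trimmed, invented CLUSTERSTATUS-style payload. The real response also
# carries per-collection state, live_nodes, etc.
raw = """
{
  "cluster": {
    "collections": {"logs_2017": {}, "logs_2016": {}},
    "aliases": {"logs": "logs_2017,logs_2016"}
  }
}
"""

response = json.loads(raw)
aliases = response["cluster"].get("aliases", {})  # key absent if none defined
for name, targets in aliases.items():
    print(name, "->", targets.split(","))  # logs -> ['logs_2017', 'logs_2016']
```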

--

/Yago Riveiro

On 27 Jan 2017 23:01 +, Noble Paul (JIRA) , wrote:
>
> [ 
> https://issues.apache.org/jira/browse/SOLR-8029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15843618#comment-15843618
>  ]
>
> Noble Paul commented on SOLR-8029:
> --
>
> Refer to this document. This is not fully updated and some of the items 
> marked as "missing" are actually implemented.
> https://docs.google.com/document/d/18n9IL6y82C8gnBred6lzG0GLaT3OsZZsBvJQ2YAt72I/edit?usp=sharing
>
> > Modernize and standardize Solr APIs
> > ---
> >
> > Key: SOLR-8029
> > URL: https://issues.apache.org/jira/browse/SOLR-8029
> > Project: Solr
> > Issue Type: Improvement
> > Affects Versions: 6.0
> > Reporter: Noble Paul
> > Assignee: Noble Paul
> > Labels: API, EaseOfUse
> > Fix For: 6.0
> >
> > Attachments: SOLR-8029.patch, SOLR-8029.patch, SOLR-8029.patch, 
> > SOLR-8029.patch, SOLR-8029.patch
> >
> >
> > Solr APIs have organically evolved and they are sometimes inconsistent with 
> > each other or not in sync with the widely followed conventions of HTTP 
> > protocol. Trying to make incremental changes to make them modern is like 
> > applying band-aid. So, we have done a complete rethink of what the APIs 
> > should be. The most notable aspects of the API are as follows:
> > The new set of APIs will be placed under a new path {{/solr2}}. The legacy 
> > APIs will continue to work under the {{/solr}} path as they used to and 
> > they will be eventually deprecated.
> > There are 4 types of requests in the new API
> > * {{/v2//*}} : Hit a collection directly or manage 
> > collections/shards/replicas
> > * {{/v2//*}} : Hit a core directly or manage cores
> > * {{/v2/cluster/*}} : Operations on cluster not pertaining to any 
> > collection or core. e.g: security, overseer ops etc
> > This will be released as part of a major release. Check the link given 
> > below for the full specification. Your comments are welcome
> > [Solr API version 2 Specification | http://bit.ly/1JYsBMQ]
>
>
>
>


Re: [jira] [Commented] (SOLR-6761) Ability to ignore commit and optimize requests from clients when running in SolrCloud mode.

2017-01-23 Thread Yago Riveiro
"I can’t”

Sorry for the typo

--

/Yago Riveiro

On 23 Jan 2017 16:50 +0000, Yago Riveiro , wrote:
> -1 for this.
>
> I don’t understand why I can’t do an optimize on a collection when I want, 
> with or without a performance penalty.
>
>
> In indexes with a high ratio of deletes, the only way to reclaim space is with 
> the optimize command, and yes, sometimes I need to run this command to 
> reclaim 100 or 200G of space from my SSDs … And yes, I know that merge 
> operations remove the deletes, but if you have segments with 100G with 
> deletes on them, you are never going to hit those segments in a merge until you 
> merge the X others that the index has.
>
> A recommendation shouldn’t be enforced; that is the main point of a 
> “recommendation”.
>
> Regards,
>
> --
>
> /Yago Riveiro
>
> On 23 Jan 2017 14:26 +, karney luo (JIRA) , wrote:
> >
> > [ 
> > https://issues.apache.org/jira/browse/SOLR-6761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15834652#comment-15834652
> >  ]
> >
> > karney luo commented on SOLR-6761:
> > --
> >
> > Thank you for your message; Luo Kunzhong has received it.
> >
> >
> > > Ability to ignore commit and optimize requests from clients when running 
> > > in SolrCloud mode.
> > > ---
> > >
> > > Key: SOLR-6761
> > > URL: https://issues.apache.org/jira/browse/SOLR-6761
> > > Project: Solr
> > > Issue Type: New Feature
> > > Components: SolrCloud, SolrJ
> > > Reporter: Timothy Potter
> > > Assignee: Timothy Potter
> > > Fix For: 5.0, 6.0
> > >
> > > Attachments: SOLR-6761.patch, SOLR-6761.patch
> > >
> > >
> > > In most SolrCloud environments, it's advisable to only rely on 
> > > auto-commits (soft and hard) configured in solrconfig.xml and not send 
> > > explicit commit requests from client applications. In fact, I've seen 
> > > cases where improperly coded client applications can send commit requests 
> > > too frequently, which can lead to harming the cluster's health.
> > > As a system administrator, I'd like the ability to disallow commit 
> > > requests from client applications. Ideally, I could configure the 
> > > updateHandler to ignore the requests and return an HTTP response code of 
> > > my choosing as I may not want to break existing client applications by 
> > > returning an error. In other words, I may want to just return 200 vs. 
> > > 405. The same goes for optimize requests.
> >
> >
> >
> >


Re: [jira] [Commented] (SOLR-6761) Ability to ignore commit and optimize requests from clients when running in SolrCloud mode.

2017-01-23 Thread Yago Riveiro
-1 for this.

I don’t understand why I can’t do an optimize on a collection when I want, with 
or without a performance penalty.


In indexes with a high ratio of deletes, the only way to reclaim space is with 
the optimize command, and yes, sometimes I need to run this command to reclaim 
100 or 200G of space from my SSDs … And yes, I know that merge operations 
remove the deletes, but if you have segments with 100G with deletes on them, 
you are never going to hit those segments in a merge until you merge the X others 
that the index has.

A recommendation shouldn’t be enforced; that is the main point of a 
“recommendation”.
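The arithmetic behind this can be sketched as follows. The segment sizes and delete ratios below are invented for illustration; the point is that only a rewrite (optimize/forceMerge) of a large, mostly-live segment frees its deleted bytes:

```python
# Hypothetical segment list: (size_gb, fraction_deleted). The numbers are
# illustrative only, not taken from the cluster described above.
segments = [(100, 0.45), (100, 0.30), (20, 0.10), (5, 0.0)]

# Space that rewriting every segment (optimize/forceMerge) could free:
reclaimable = sum(size * deleted for size, deleted in segments)
print(f"reclaimable: {reclaimable:.1f} GB")  # reclaimable: 77.0 GB
```

As long as the two 100G segments are never selected for a natural merge, roughly 75G of that space stays pinned on disk.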

Regards,

--

/Yago Riveiro

On 23 Jan 2017 14:26 +, karney luo (JIRA) , wrote:
>
> [ 
> https://issues.apache.org/jira/browse/SOLR-6761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15834652#comment-15834652
>  ]
>
> karney luo commented on SOLR-6761:
> --
>
> Thank you for your message; Luo Kunzhong has received it.
>
>
> > Ability to ignore commit and optimize requests from clients when running in 
> > SolrCloud mode.
> > ---
> >
> > Key: SOLR-6761
> > URL: https://issues.apache.org/jira/browse/SOLR-6761
> > Project: Solr
> > Issue Type: New Feature
> > Components: SolrCloud, SolrJ
> > Reporter: Timothy Potter
> > Assignee: Timothy Potter
> > Fix For: 5.0, 6.0
> >
> > Attachments: SOLR-6761.patch, SOLR-6761.patch
> >
> >
> > In most SolrCloud environments, it's advisable to only rely on auto-commits 
> > (soft and hard) configured in solrconfig.xml and not send explicit commit 
> > requests from client applications. In fact, I've seen cases where 
> > improperly coded client applications can send commit requests too 
> > frequently, which can lead to harming the cluster's health.
> > As a system administrator, I'd like the ability to disallow commit requests 
> > from client applications. Ideally, I could configure the updateHandler to 
> > ignore the requests and return an HTTP response code of my choosing as I 
> > may not want to break existing client applications by returning an error. 
> > In other words, I may want to just return 200 vs. 405. The same goes for 
> > optimize requests.
>
>
>
>


[jira] [Commented] (SOLR-7191) Improve stability and startup performance of SolrCloud with thousands of collections

2017-01-21 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15832908#comment-15832908
 ] 

Yago Riveiro commented on SOLR-7191:


Restarting a node in 6.3 now takes forever ... I bumped coreLoadThreads from 4 
to 512, and restarting a node with 1500 collections takes 20-25 minutes. If I 
bump coreLoadThreads to 1024 or 2048 it is faster, but sometimes replicas stay in 
a wrong state and never come up.

Another thing I now see happening is collections being created without replicas.


Shawn, where can I raise Jetty’s maxThreads?

> Improve stability and startup performance of SolrCloud with thousands of 
> collections
> 
>
> Key: SOLR-7191
> URL: https://issues.apache.org/jira/browse/SOLR-7191
> Project: Solr
>  Issue Type: Bug
>  Components: SolrCloud
>Affects Versions: 5.0
>Reporter: Shawn Heisey
>Assignee: Noble Paul
>  Labels: performance, scalability
> Fix For: 6.3
>
> Attachments: lots-of-zkstatereader-updates-branch_5x.log, 
> SOLR-7191.patch, SOLR-7191.patch, SOLR-7191.patch, SOLR-7191.patch, 
> SOLR-7191.patch, SOLR-7191.patch, SOLR-7191.patch
>
>
> A user on the mailing list with thousands of collections (5000 on 4.10.3, 
> 4000 on 5.0) is having severe problems with getting Solr to restart.
> I tried as hard as I could to duplicate the user setup, but I ran into many 
> problems myself even before I was able to get 4000 collections created on a 
> 5.0 example cloud setup.  Restarting Solr takes a very long time, and it is 
> not very stable once it's up and running.
> This kind of setup is very much pushing the envelope on SolrCloud performance 
> and scalability.  It doesn't help that I'm running both Solr nodes on one 
> machine (I started with 'bin/solr -e cloud') and that ZK is embedded.






[jira] [Commented] (SOLR-10015) Remove strong reference to Field Cache key (optional) so that GC can release some Field Cache entries when Solr is under memory pressure

2017-01-20 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15832512#comment-15832512
 ] 

Yago Riveiro commented on SOLR-10015:
-

Honestly, I would prefer a degradation of performance to an OOM (where you 
will lose your cache anyway ...).

> Remove strong reference to Field Cache key (optional) so that GC can release 
> some Field Cache entries when Solr is under memory pressure
> 
>
> Key: SOLR-10015
> URL: https://issues.apache.org/jira/browse/SOLR-10015
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Michael Sun
> Attachments: SOLR-10015-prototype.patch
>
>
> In current Field Cache (FC) implementation, a WeakHashMap is used, supposedly 
> designed to allow GC to release some Field Cache entries when Solr is under 
> memory pressure. 
> However, in practice, FC entry releasing seldom happens. Even worse, sometime 
> Solr goes OOM and a heap dump shows a large amount of memory actually used by 
> FC. It's a sign that GC is not able to release FC entries even though WeakHashMap is 
> used.
> The reason is that FC is using SegmentCoreReaders as the key to the 
> WeakHashMap. However, SegmentCoreReaders is usually strong referenced by 
> SegmentReader. A strong reference prevents GC from releasing the key and 
> therefore the value, so GC can't release entries in FC's WeakHashMap. 
> The JIRA is to propose a solution to remove the strong reference mentioned 
> above so that GC can release FC entries to avoid long GC pause or OOM. It 
> needs to be optional because this change is a tradeoff, trading more CPU 
> cycles for a lower memory footprint. Users can make the final decision depending on 
> their use cases.
> The attached prototype uses a combination of directory name and segment name 
> as the key to the WeakHashMap, replacing SegmentCoreReaders. Without the change, 
> Solr doesn't release any FC entries after a GC is manually triggered. With 
> the change, FC entries are usually released after GC.
> However, I am not sure if it's the best way to solve this problem. Any 
> suggestions are welcome.
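The keying problem described above (a weak-keyed cache entry that can never be collected because something else strongly references the key) can be demonstrated in miniature with Python's weakref module. CoreKey and Reader below are stand-ins for SegmentCoreReaders and SegmentReader, not real Solr classes:

```python
import gc
import weakref

class CoreKey:
    """Stand-in for SegmentCoreReaders, the cache key."""

class Reader:
    """Stand-in for SegmentReader, which strongly references the key."""
    def __init__(self, core):
        self.core = core

cache = weakref.WeakKeyDictionary()  # analogue of the FieldCache map

reader = Reader(CoreKey())
cache[reader.core] = "field-cache-entry"

gc.collect()
print(len(cache))  # 1 -- key still strongly reachable via reader.core

reader = None      # drop the last strong reference to the key
gc.collect()
print(len(cache))  # 0 -- the entry could finally be released
```

The proposed patch is the second half of this demo: by keying on something the SegmentReader does not hold (directory plus segment name), the key becomes collectible and the cache entry can be released under memory pressure.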






[jira] [Commented] (SOLR-8589) Add aliases to the LIST action results in the Collections API

2017-01-11 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15818106#comment-15818106
 ] 

Yago Riveiro commented on SOLR-8589:


Any progress on this issue?

> Add aliases to the LIST action results in the Collections API
> -
>
> Key: SOLR-8589
> URL: https://issues.apache.org/jira/browse/SOLR-8589
> Project: Solr
>  Issue Type: Improvement
>  Components: SolrCloud
>Affects Versions: 5.4.1
>Reporter: Shawn Heisey
>Assignee: Shawn Heisey
>Priority: Minor
> Attachments: SOLR-8589.patch, SOLR-8589.patch, SOLR-8589.patch, 
> SOLR-8589.patch, solr-8589-new-list-details-aliases.png
>
>
> Although it is possible to get a list of SolrCloud aliases via an HTTP API, it 
> is not available as a typical query response; I believe it is only available 
> via the HTTP API for ZooKeeper.
> The results from the LIST action in the Collections API is well-situated to 
> handle this. The current results are contained in a "collections" node, we 
> can simply add an "aliases" node if there are any aliases defined.






[jira] [Commented] (SOLR-9835) Create another replication mode for SolrCloud

2017-01-10 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15816535#comment-15816535
 ] 

Yago Riveiro commented on SOLR-9835:


bq. how about getLiveReplicasCount() ?

If I'm reading the code and find a method called getLiveReplicasCount(), I 
expect it to return the number of live replicas for a shard; if the only 
values it can return are 1 for onlyLeaderIndexes and -1 for the rest, it is not a 
good name.

Something like 
{{zkStateReader.getClusterState().getCollection(collection).getReplicationMode()}}, 
returning an enum (ONLY_LEADER_INDEXES, ALL_REPLICAS_INDEXES), or something 
like that.
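As a sketch only, the suggested accessor could look something like this, with a Python Enum standing in for the Java enum. All names here are hypothetical, taken from the comment above rather than from any shipped API:

```python
from enum import Enum

class ReplicationMode(Enum):
    # Hypothetical values mirroring the suggestion above.
    ONLY_LEADER_INDEXES = "onlyLeaderIndexes"
    ALL_REPLICAS_INDEXES = "allReplicasIndexes"

def get_replication_mode(collection_props: dict) -> ReplicationMode:
    """Return the collection's replication mode, defaulting to the
    classic behaviour where every replica indexes."""
    if collection_props.get("liveReplicas") == 1:
        return ReplicationMode.ONLY_LEADER_INDEXES
    return ReplicationMode.ALL_REPLICAS_INDEXES

print(get_replication_mode({"liveReplicas": 1}).name)  # ONLY_LEADER_INDEXES
print(get_replication_mode({}).name)                   # ALL_REPLICAS_INDEXES
```

The design point being made is that an enum names the two modes explicitly, instead of overloading a count-returning method with sentinel values like -1.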



> Create another replication mode for SolrCloud
> -
>
> Key: SOLR-9835
> URL: https://issues.apache.org/jira/browse/SOLR-9835
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Cao Manh Dat
>Assignee: Shalin Shekhar Mangar
> Attachments: SOLR-9835.patch, SOLR-9835.patch, SOLR-9835.patch, 
> SOLR-9835.patch, SOLR-9835.patch, SOLR-9835.patch
>
>
> The current replication mechanism of SolrCloud is called state machine: replicas 
> start in the same initial state and each input is distributed across replicas, so 
> all replicas end up in the same next state. But this type of replication has some 
> drawbacks:
> - The commit (which is costly) has to run on all replicas
> - Slow recovery, because if a replica misses more than N updates during its down 
> time, it has to download the entire index from its leader.
> So we create another replication mode for SolrCloud called state transfer, which 
> acts like master/slave replication. Basically:
> - The leader distributes the update to the other replicas, but only the leader 
> applies the update to the IW; the other replicas just store the update in the 
> UpdateLog (acting like replication).
> - Replicas frequently poll the latest segments from the leader.
> Pros:
> - Lightweight indexing, because only the leader runs the commits and updates.
> - Very fast recovery: replicas just have to download the missing segments.
> To use this new replication mode, a new collection must be created with an 
> additional parameter {{liveReplicas=1}}
> {code}
> http://localhost:8983/solr/admin/collections?action=CREATE&name=newCollection&numShards=2&replicationFactor=1&liveReplicas=1
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-9951) FileAlreadyExistsException on replication.properties

2017-01-10 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15814797#comment-15814797
 ] 

Yago Riveiro commented on SOLR-9951:


Isn't this a duplicate of 
[SOLR-9859|https://issues.apache.org/jira/browse/SOLR-9859]?

> FileAlreadyExistsException on replication.properties
> 
>
> Key: SOLR-9951
> URL: https://issues.apache.org/jira/browse/SOLR-9951
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 6.3
>Reporter: Markus Jelsma
>Priority: Minor
> Fix For: master (7.0), 6.4
>
>
> Just spotted this one right after restarting two nodes. Only one node logged 
> the error. It's a single shard with two replicas. The exception was logged 
> for all three active cores:
> {code}
> java.nio.file.FileAlreadyExistsException: 
> /var/lib/solr/core_shard1_replica1/data/replication.properties
>   at 
> sun.nio.fs.UnixException.translateToIOException(UnixException.java:88)
>   at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
>   at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
>   at 
> sun.nio.fs.UnixFileSystemProvider.newByteChannel(UnixFileSystemProvider.java:214)
>   at 
> java.nio.file.spi.FileSystemProvider.newOutputStream(FileSystemProvider.java:434)
>   at java.nio.file.Files.newOutputStream(Files.java:216)
>   at 
> org.apache.lucene.store.FSDirectory$FSIndexOutput.<init>(FSDirectory.java:413)
>   at 
> org.apache.lucene.store.FSDirectory$FSIndexOutput.<init>(FSDirectory.java:409)
>   at 
> org.apache.lucene.store.FSDirectory.createOutput(FSDirectory.java:253)
>   at 
> org.apache.lucene.store.NRTCachingDirectory.createOutput(NRTCachingDirectory.java:157)
>   at 
> org.apache.solr.handler.IndexFetcher.logReplicationTimeAndConfFiles(IndexFetcher.java:675)
>   at 
> org.apache.solr.handler.IndexFetcher.fetchLatestIndex(IndexFetcher.java:487)
>   at 
> org.apache.solr.handler.IndexFetcher.fetchLatestIndex(IndexFetcher.java:251)
>   at 
> org.apache.solr.handler.ReplicationHandler.doFetch(ReplicationHandler.java:397)
>   at 
> org.apache.solr.cloud.RecoveryStrategy.replicate(RecoveryStrategy.java:156)
>   at 
> org.apache.solr.cloud.RecoveryStrategy.doRecovery(RecoveryStrategy.java:408)
>   at org.apache.solr.cloud.RecoveryStrategy.run(RecoveryStrategy.java:221)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:229)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> {code}
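For context, this exception comes straight from java.nio: opening a path with StandardOpenOption.CREATE_NEW fails if the file already exists. A minimal, self-contained reproduction (purely illustrative; it does not reuse Solr's code path):

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Illustrative only: demonstrates why Files.newOutputStream can throw
// FileAlreadyExistsException, as seen in the stack trace above.
class CreateNewDemo {

    // Returns true if opening the path with CREATE_NEW failed because
    // the file already exists.
    static boolean createNewFails(Path existing) throws IOException {
        try {
            Files.newOutputStream(existing, StandardOpenOption.CREATE_NEW).close();
            return false;
        } catch (FileAlreadyExistsException e) {
            return true;
        }
    }

    public static void main(String[] args) throws IOException {
        Path p = Files.createTempFile("replication", ".properties");
        System.out.println("CREATE_NEW fails on existing file: " + createNewFails(p));
        // CREATE + TRUNCATE_EXISTING overwrites instead of failing.
        Files.newOutputStream(p, StandardOpenOption.CREATE,
                StandardOpenOption.TRUNCATE_EXISTING).close();
        Files.delete(p);
    }
}
```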



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-8362) Add docValues support for TextField

2016-12-30 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15788494#comment-15788494
 ] 

Yago Riveiro commented on SOLR-8362:


Streams only work with fields that have docValues configured. Since TextField 
doesn't support docValues, I thought that if the field type had docValues, 
streams would work.

We want the stored value instead; your explanation makes sense :)

> Add docValues support for TextField
> ---
>
> Key: SOLR-8362
> URL: https://issues.apache.org/jira/browse/SOLR-8362
> Project: Solr
>  Issue Type: Improvement
>Reporter: Hoss Man
>
> At the last Lucene/Solr Revolution, Toke asked a question about why TextField 
> doesn't support docValues. The short answer is because no one ever added it, 
> but the longer answer is that we would have to think through carefully 
> the _intent_ of supporting docValues for a "tokenized" field like TextField, 
> and how to support various conflicting use cases where they could be handy.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-8362) Add docValues support for TextField

2016-12-30 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15788348#comment-15788348
 ] 

Yago Riveiro commented on SOLR-8362:


Without docValues support for text fields, reindexing a collection using the 
Update Stream Decorator isn't possible either.

Streams are great for reindexing data with decent throughput.

> Add docValues support for TextField
> ---
>
> Key: SOLR-8362
> URL: https://issues.apache.org/jira/browse/SOLR-8362
> Project: Solr
>  Issue Type: Improvement
>Reporter: Hoss Man
>
> At the last Lucene/Solr Revolution, Toke asked a question about why TextField 
> doesn't support docValues. The short answer is because no one ever added it, 
> but the longer answer is that we would have to think through carefully 
> the _intent_ of supporting docValues for a "tokenized" field like TextField, 
> and how to support various conflicting use cases where they could be handy.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-9241) Rebalance API for SolrCloud

2016-12-28 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15782991#comment-15782991
 ] 

Yago Riveiro commented on SOLR-9241:


Issue SOLR-9322, the RESHARD command.

> Rebalance API for SolrCloud
> ---
>
> Key: SOLR-9241
> URL: https://issues.apache.org/jira/browse/SOLR-9241
> Project: Solr
>  Issue Type: New Feature
>  Components: SolrCloud
>Affects Versions: 6.1
> Environment: Ubuntu, Mac OsX
>Reporter: Nitin Sharma
>  Labels: Cluster, SolrCloud
> Fix For: 6.1
>
> Attachments: Redistribute_After.jpeg, Redistribute_Before.jpeg, 
> Redistribute_call.jpeg, Replace_After.jpeg, Replace_Before.jpeg, 
> Replace_Call.jpeg, SOLR-9241-4.6.patch, SOLR-9241-6.1.patch
>
>   Original Estimate: 2,016h
>  Remaining Estimate: 2,016h
>
> This is the v1 of the patch for Solrcloud Rebalance api (as described in 
> http://engineering.bloomreach.com/solrcloud-rebalance-api/) , built at 
> Bloomreach by Nitin Sharma and Suruchi Shah. The goal of the API  is to 
> provide a zero downtime mechanism to perform data manipulation and  efficient 
> core allocation in solrcloud. This API was envisioned to be the base layer 
> that enables Solrcloud to be an auto scaling platform. (and work in unison 
> with other complementing monitoring and scaling features).
> Patch Status:
> ===
> The patch is work in progress and incremental. We have done a few rounds of 
> code cleanup. We wanted to get the patch going first to get initial 
> feedback. We will continue to work on making it more open-source friendly and 
> easily testable.
>  Deployment Status:
> 
> The platform is deployed in production at bloomreach and has been battle 
> tested for large scale load. (millions of documents and hundreds of 
> collections).
>  Internals:
> =
> The internals of the API and performance : 
> http://engineering.bloomreach.com/solrcloud-rebalance-api/
> It is built on top of the admin collections API as an action (with various 
> flavors). At a high level, the rebalance api provides 2 constructs:
> Scaling Strategy:  Decides how to move the data.  Every flavor has multiple 
> options which can be reviewed in the api spec.
> Re-distribute  - Move around data in the cluster based on capacity/allocation.
> Auto Shard  - Dynamically shard a collection to any size.
> Smart Merge - Distributed Mode - Helps merge data from a larger shard setup 
> into a smaller one (the source should be divisible by the destination).
> Scale up -  Add replicas on the fly
> Scale Down - Remove replicas on the fly
> Allocation Strategy:  Decides where to put the data.  (Nodes with least 
> cores, Nodes that do not have this collection etc). Custom implementations 
> can be built on top as well. One other example is Availability Zone aware. 
> Distribute data such that every replica is placed on different availability 
> zone to support HA.
>  Detailed API Spec:
> 
>   https://github.com/bloomreach/solrcloud-rebalance-api
>  Contributors:
> =
>   Nitin Sharma
>   Suruchi Shah
>  Questions/Comments:
> =
>   You can reach me at nitin.sha...@bloomreach.com



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-9241) Rebalance API for SolrCloud

2016-12-27 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15780767#comment-15780767
 ] 

Yago Riveiro commented on SOLR-9241:


Any progress on this?

> Rebalance API for SolrCloud
> ---
>
> Key: SOLR-9241
> URL: https://issues.apache.org/jira/browse/SOLR-9241
> Project: Solr
>  Issue Type: New Feature
>  Components: SolrCloud
>Affects Versions: 6.1
> Environment: Ubuntu, Mac OsX
>Reporter: Nitin Sharma
>  Labels: Cluster, SolrCloud
> Fix For: 6.1
>
> Attachments: Redistribute_After.jpeg, Redistribute_Before.jpeg, 
> Redistribute_call.jpeg, Replace_After.jpeg, Replace_Before.jpeg, 
> Replace_Call.jpeg, SOLR-9241-4.6.patch, SOLR-9241-6.1.patch
>
>   Original Estimate: 2,016h
>  Remaining Estimate: 2,016h
>
> This is the v1 of the patch for Solrcloud Rebalance api (as described in 
> http://engineering.bloomreach.com/solrcloud-rebalance-api/) , built at 
> Bloomreach by Nitin Sharma and Suruchi Shah. The goal of the API  is to 
> provide a zero downtime mechanism to perform data manipulation and  efficient 
> core allocation in solrcloud. This API was envisioned to be the base layer 
> that enables Solrcloud to be an auto scaling platform. (and work in unison 
> with other complementing monitoring and scaling features).
> Patch Status:
> ===
> The patch is work in progress and incremental. We have done a few rounds of 
> code cleanup. We wanted to get the patch going first to get initial 
> feedback. We will continue to work on making it more open-source friendly and 
> easily testable.
>  Deployment Status:
> 
> The platform is deployed in production at bloomreach and has been battle 
> tested for large scale load. (millions of documents and hundreds of 
> collections).
>  Internals:
> =
> The internals of the API and performance : 
> http://engineering.bloomreach.com/solrcloud-rebalance-api/
> It is built on top of the admin collections API as an action (with various 
> flavors). At a high level, the rebalance api provides 2 constructs:
> Scaling Strategy:  Decides how to move the data.  Every flavor has multiple 
> options which can be reviewed in the api spec.
> Re-distribute  - Move around data in the cluster based on capacity/allocation.
> Auto Shard  - Dynamically shard a collection to any size.
> Smart Merge - Distributed Mode - Helps merge data from a larger shard setup 
> into a smaller one (the source should be divisible by the destination).
> Scale up -  Add replicas on the fly
> Scale Down - Remove replicas on the fly
> Allocation Strategy:  Decides where to put the data.  (Nodes with least 
> cores, Nodes that do not have this collection etc). Custom implementations 
> can be built on top as well. One other example is Availability Zone aware. 
> Distribute data such that every replica is placed on different availability 
> zone to support HA.
>  Detailed API Spec:
> 
>   https://github.com/bloomreach/solrcloud-rebalance-api
>  Contributors:
> =
>   Nitin Sharma
>   Suruchi Shah
>  Questions/Comments:
> =
>   You can reach me at nitin.sha...@bloomreach.com



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-9880) Add Ganglia and Graphite metrics reporters

2016-12-22 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15769533#comment-15769533
 ] 

Yago Riveiro commented on SOLR-9880:


With a file-based reporter I can write a wrapper to read the file; that's fine, 
and for integrations I think it's the better way, plain text is Unix-friendly :)

+1 to having a file-based reporter.

> Add Ganglia and Graphite metrics reporters
> --
>
> Key: SOLR-9880
> URL: https://issues.apache.org/jira/browse/SOLR-9880
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: metrics
>Reporter: Andrzej Bialecki 
>Assignee: Andrzej Bialecki 
>Priority: Minor
> Fix For: master (7.0), 6.4
>
>
> Originally SOLR-4735 provided implementations for these reporters (wrappers 
> for Dropwizard components to use {{SolrMetricReporter}} API).
> However, this functionality has been split into its own issue due to the 
> additional transitive dependencies that these reporters bring:
> * Ganglia:
> ** metrics-ganglia, ASL, 3kB
> ** gmetric4j (Ganglia RPC implementation), BSD, 29kB
> * Graphite
> ** metrics-graphite, ASL, 10kB
> ** amqp-client (RabbitMQ Java client, marked optional in pom?), ASL/MIT/GPL2, 
> 190kB
> IMHO these are not very large dependencies, and given the useful 
> functionality they provide it's worth adding them.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-9880) Add Ganglia and Graphite metrics reporters

2016-12-21 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15767100#comment-15767100
 ] 

Yago Riveiro commented on SOLR-9880:


Any chance to also add the metrics-zabbix reporter?

https://github.com/hengyunabc/metrics-zabbix

> Add Ganglia and Graphite metrics reporters
> --
>
> Key: SOLR-9880
> URL: https://issues.apache.org/jira/browse/SOLR-9880
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: metrics
>Reporter: Andrzej Bialecki 
>Assignee: Andrzej Bialecki 
>Priority: Minor
> Fix For: master (7.0), 6.4
>
>
> Originally SOLR-4735 provided implementations for these reporters (wrappers 
> for Dropwizard components to use {{SolrMetricReporter}} API).
> However, this functionality has been split into its own issue due to the 
> additional transitive dependencies that these reporters bring:
> * Ganglia:
> ** metrics-ganglia, ASL, 3kB
> ** gmetric4j (Ganglia RPC implementation), BSD, 29kB
> * Graphite
> ** metrics-graphite, ASL, 10kB
> ** amqp-client (RabbitMQ Java client, marked optional in pom?), ASL/MIT/GPL2, 
> 190kB
> IMHO these are not very large dependencies, and given the useful 
> functionality they provide it's worth adding them.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (SOLR-9882) ClassCastException: BasicResultContext cannot be cast to SolrDocumentList

2016-12-20 Thread Yago Riveiro (JIRA)
Yago Riveiro created SOLR-9882:
--

 Summary: ClassCastException: BasicResultContext cannot be cast to 
SolrDocumentList
 Key: SOLR-9882
 URL: https://issues.apache.org/jira/browse/SOLR-9882
 Project: Solr
  Issue Type: Bug
  Security Level: Public (Default Security Level. Issues are Public)
Affects Versions: 6.3
Reporter: Yago Riveiro


After talking with [~yo...@apache.org] on the mailing list, I'm opening this JIRA ticket.

I'm hitting this bug in Solr 6.3.0.

null:java.lang.ClassCastException:
org.apache.solr.response.BasicResultContext cannot be cast to
org.apache.solr.common.SolrDocumentList
at
org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:315)
at
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:153)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:2213)
at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:654)
at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:460)
at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:303)
at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:254)
at
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1668)
at
org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:581)
at
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
at
org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
at
org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
at
org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1160)
at
org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:511)
at
org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
at
org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1092)
at
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
at
org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)
at
org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119)
at
org.eclipse.jetty.server.handler.StatisticsHandler.handle(StatisticsHandler.java:169)
at
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
at org.eclipse.jetty.server.Server.handle(Server.java:518)
at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:308)
at
org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:244)
at
org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:273)
at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:95)
at
org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)
at
org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceAndRun(ExecuteProduceConsume.java:246)
at
org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:156)
at
org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:654)
at
org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:572)
at java.lang.Thread.run(Thread.java:745)




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-3274) ZooKeeper related SolrCloud problems

2016-12-20 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15764336#comment-15764336
 ] 

Yago Riveiro commented on SOLR-3274:


I'm hitting this a lot in 6.3.0 and I don't know why; my ZooKeeper session 
timeout (TTL) is 120s and there is nothing in the GC log with pauses higher 
than 100ms.

Is there some configuration to see the reason for the failure talking to 
ZooKeeper, like a connection timeout or something else?

org.apache.solr.common.SolrException: Cannot talk to ZooKeeper - Updates are 
disabled.
at 
org.apache.solr.update.processor.DistributedUpdateProcessor.zkCheck(DistributedUpdateProcessor.java:1508)
at 
org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:696)
at 
org.apache.solr.handler.loader.JavabinLoader$1.update(JavabinLoader.java:97)
at 
org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$1.readOuterMostDocIterator(JavaBinUpdateRequestCodec.java:179)
at 
org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$1.readIterator(JavaBinUpdateRequestCodec.java:135)
at 
org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:275)
at 
org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$1.readNamedList(JavaBinUpdateRequestCodec.java:121)
at 
org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:240)
at 
org.apache.solr.common.util.JavaBinCodec.unmarshal(JavaBinCodec.java:158)
at 
org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec.unmarshal(JavaBinUpdateRequestCodec.java:186)
at 
org.apache.solr.handler.loader.JavabinLoader.parseAndLoadDocs(JavabinLoader.java:107)
at 
org.apache.solr.handler.loader.JavabinLoader.load(JavabinLoader.java:54)
at 
org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:97)
at 
org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:68)
at 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:153)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:2213)
at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:654)
at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:460)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:303)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:254)
at 
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1668)
at 
org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:581)
at 
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
at 
org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
at 
org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
at 
org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1160)
at 
org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:511)
at 
org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
at 
org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1092)
at 
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
at 
org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)
at 
org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119)
at 
org.eclipse.jetty.server.handler.StatisticsHandler.handle(StatisticsHandler.java:169)
at 
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
at org.eclipse.jetty.server.Server.handle(Server.java:518)
at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:308)
at 
org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:244)
at 
org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:273)
at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:95)
at 
org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)
at 
org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceAndRun(ExecuteProduceConsume.java:246)
at 
org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:156)
at 
org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:654)
at 
org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:572)
at java.lang.Thread.run(Thread.java:745)

> ZooKeeper related SolrCloud problems
> 
>
> 

[jira] [Comment Edited] (SOLR-9580) Exception while updating statistics

2016-12-12 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15742756#comment-15742756
 ] 

Yago Riveiro edited comment on SOLR-9580 at 12/12/16 6:43 PM:
--

I'm hitting the same bug with version 6.3.0


was (Author: yriveiro):
I'm running into the same bug with version 6.3.0

> Exception while updating statistics
> ---
>
> Key: SOLR-9580
> URL: https://issues.apache.org/jira/browse/SOLR-9580
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 6.2
>Reporter: Chris de Kok
>
> The replication throws a warning after the second time the replication occurs, 
> complaining that the replication.properties file already exists.
> WARN true
> IndexFetcher
> Exception while updating statistics
> java.nio.file.FileAlreadyExistsException: 
> /var/local/solr/cores/data/replication.properties
>   at 
> sun.nio.fs.UnixException.translateToIOException(UnixException.java:88)
>   at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
>   at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
>   at 
> sun.nio.fs.UnixFileSystemProvider.newByteChannel(UnixFileSystemProvider.java:214)
>   at 
> java.nio.file.spi.FileSystemProvider.newOutputStream(FileSystemProvider.java:434)
>   at java.nio.file.Files.newOutputStream(Files.java:216)
>   at 
> org.apache.lucene.store.FSDirectory$FSIndexOutput.<init>(FSDirectory.java:413)
>   at 
> org.apache.lucene.store.FSDirectory$FSIndexOutput.<init>(FSDirectory.java:409)
>   at 
> org.apache.lucene.store.FSDirectory.createOutput(FSDirectory.java:253)
>   at 
> org.apache.lucene.store.NRTCachingDirectory.createOutput(NRTCachingDirectory.java:157)
>   at 
> org.apache.solr.handler.IndexFetcher.logReplicationTimeAndConfFiles(IndexFetcher.java:681)
>   at 
> org.apache.solr.handler.IndexFetcher.fetchLatestIndex(IndexFetcher.java:493)
>   at 
> org.apache.solr.handler.IndexFetcher.fetchLatestIndex(IndexFetcher.java:254)
>   at 
> org.apache.solr.handler.ReplicationHandler.doFetch(ReplicationHandler.java:397)
>   at 
> org.apache.solr.handler.ReplicationHandler.lambda$setupPolling$2(ReplicationHandler.java:1145)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
>   at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
>   at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-9580) Exception while updating statistics

2016-12-12 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15742756#comment-15742756
 ] 

Yago Riveiro commented on SOLR-9580:


I'm running into the same bug with version 6.3.0

> Exception while updating statistics
> ---
>
> Key: SOLR-9580
> URL: https://issues.apache.org/jira/browse/SOLR-9580
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 6.2
>Reporter: Chris de Kok
>
> The replication throws a warning after the second time the replication occurs, 
> complaining that the replication.properties file already exists.
> WARN true
> IndexFetcher
> Exception while updating statistics
> java.nio.file.FileAlreadyExistsException: 
> /var/local/solr/cores/data/replication.properties
>   at 
> sun.nio.fs.UnixException.translateToIOException(UnixException.java:88)
>   at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
>   at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
>   at 
> sun.nio.fs.UnixFileSystemProvider.newByteChannel(UnixFileSystemProvider.java:214)
>   at 
> java.nio.file.spi.FileSystemProvider.newOutputStream(FileSystemProvider.java:434)
>   at java.nio.file.Files.newOutputStream(Files.java:216)
>   at 
> org.apache.lucene.store.FSDirectory$FSIndexOutput.<init>(FSDirectory.java:413)
>   at 
> org.apache.lucene.store.FSDirectory$FSIndexOutput.<init>(FSDirectory.java:409)
>   at 
> org.apache.lucene.store.FSDirectory.createOutput(FSDirectory.java:253)
>   at 
> org.apache.lucene.store.NRTCachingDirectory.createOutput(NRTCachingDirectory.java:157)
>   at 
> org.apache.solr.handler.IndexFetcher.logReplicationTimeAndConfFiles(IndexFetcher.java:681)
>   at 
> org.apache.solr.handler.IndexFetcher.fetchLatestIndex(IndexFetcher.java:493)
>   at 
> org.apache.solr.handler.IndexFetcher.fetchLatestIndex(IndexFetcher.java:254)
>   at 
> org.apache.solr.handler.ReplicationHandler.doFetch(ReplicationHandler.java:397)
>   at 
> org.apache.solr.handler.ReplicationHandler.lambda$setupPolling$2(ReplicationHandler.java:1145)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
>   at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
>   at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5894) Speed up high-cardinality facets with sparse counters

2016-12-05 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15722603#comment-15722603
 ] 

Yago Riveiro commented on SOLR-5894:


Are facets with sparse counters faster than the current JSON facets?

> Speed up high-cardinality facets with sparse counters
> -
>
> Key: SOLR-5894
> URL: https://issues.apache.org/jira/browse/SOLR-5894
> Project: Solr
>  Issue Type: Improvement
>  Components: SearchComponents - other
>Affects Versions: 4.7.1
>Reporter: Toke Eskildsen
>Priority: Minor
>  Labels: faceted-search, faceting, memory, performance
> Attachments: SOLR-5894.patch, SOLR-5894.patch, SOLR-5894.patch, 
> SOLR-5894.patch, SOLR-5894.patch, SOLR-5894.patch, SOLR-5894.patch, 
> SOLR-5894.patch, SOLR-5894.patch, SOLR-5894_test.zip, SOLR-5894_test.zip, 
> SOLR-5894_test.zip, SOLR-5894_test.zip, SOLR-5894_test.zip, 
> author_7M_tags_1852_logged_queries_warmed.png, 
> sparse_200docs_fc_cutoff_20140403-145412.png, 
> sparse_500docs_20140331-151918_multi.png, 
> sparse_500docs_20140331-151918_single.png, 
> sparse_5051docs_20140328-152807.png
>
>
> Multiple performance enhancements to Solr String faceting.
> * Sparse counters, trading the constant-time overhead of extracting top-X 
> terms for a time overhead linear in the result set size
> * Counter re-use for reduced garbage collection and lower per-call overhead
> * Optional counter packing, trading speed for space
> * Improved distribution count logic, greatly improving the performance of 
> distributed faceting
> * In-segment threaded faceting
> * Regexp based white- and black-listing of facet terms
> * Heuristic faceting for large result sets
> Currently implemented for Solr 4.10. Source, detailed description and 
> directly usable WAR at http://tokee.github.io/lucene-solr/
> This project has grown beyond a simple patch and will require a fair amount 
> of co-operation with a committer to get into Solr. Splitting into smaller 
> issues is a possibility.
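The sparse-counter idea in the first bullet can be sketched in plain Java. This is a minimal illustration of the technique, not the patch's actual implementation: alongside the counts array, the counter records which ordinals were touched, so finding candidate terms and resetting both cost time proportional to the result set size rather than to the size of the term dictionary.

```java
// Minimal sketch of a sparse facet counter (illustrative, not Solr's code).
public class SparseCounter {
    private final int[] counts;
    private final int[] touched;   // ordinals that currently have a non-zero count
    private int touchedSize = 0;

    public SparseCounter(int numTerms) {
        counts = new int[numTerms];
        touched = new int[numTerms];
    }

    public void increment(int ordinal) {
        if (counts[ordinal]++ == 0) {
            touched[touchedSize++] = ordinal;  // first hit: remember the ordinal
        }
    }

    // Scan only the touched ordinals instead of the whole counts array.
    public int maxCount() {
        int max = 0;
        for (int i = 0; i < touchedSize; i++) {
            max = Math.max(max, counts[touched[i]]);
        }
        return max;
    }

    // Counter re-use: clear only the touched slots, not the full array,
    // which is what makes re-using large counters cheap.
    public void reset() {
        for (int i = 0; i < touchedSize; i++) {
            counts[touched[i]] = 0;
        }
        touchedSize = 0;
    }

    public static void main(String[] args) {
        SparseCounter c = new SparseCounter(1_000_000);
        for (int ord : new int[] {42, 42, 7, 42, 7, 99}) {
            c.increment(ord);
        }
        System.out.println(c.maxCount());  // 3 (ordinal 42 was hit three times)
        c.reset();
        System.out.println(c.maxCount());  // 0
    }
}
```

For small result sets over a high-cardinality field this avoids ever iterating the million-slot array, which is the source of the speedup the issue describes.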






[jira] [Commented] (SOLR-9818) Solr admin UI rapidly retries any request(s) if it loses connection with the server

2016-12-02 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15714706#comment-15714706
 ] 

Yago Riveiro commented on SOLR-9818:


This problem is critical when using the UI to create replicas. The last time I 
performed the operation while the cluster was busy, the result was 23 new 
replicas for my shard ...

> Solr admin UI rapidly retries any request(s) if it loses connection with the 
> server
> ---
>
> Key: SOLR-9818
> URL: https://issues.apache.org/jira/browse/SOLR-9818
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: web gui
>Affects Versions: 6.3
>Reporter: Ere Maijala
>
> It seems that whenever the Solr admin UI loses connection with the server, be 
> the reason that the server is too slow to answer or that it's gone away 
> completely, it starts hammering the server with the previous request until it 
> gets a success response, it seems. That can be especially bad if the last 
> attempted action was something like collection reload with a SolrCloud 
> instance. The admin UI will quickly add hundreds of reload commands to 
> overseer/collection-queue-work, which may essentially cause the replicas to 
> get overloaded when they're trying to handle all the reload commands.
> I believe the UI should never retry the previous command blindly when the 
> connection is lost, but instead just ping the server until it responds again.
> Steps to reproduce:
> 1.) Fire up Solr
> 2.) Open the admin UI in browser
> 3.) Open a web console in the browser to see the requests it sends
> 4.) Stop solr
> 5.) Try an action in the admin UI
> 6.) Observe the web console in browser quickly fill up with repeats of the 
> originally attempted request
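The fix the report suggests ("just ping the server until it responds again") can be sketched as a small reconnect policy. This is illustrative pseudologic in Java, not the admin UI's actual JavaScript: on connection loss the last command is never replayed; instead a cheap ping is polled with capped exponential backoff.

```java
import java.util.function.Supplier;

// Sketch of the suggested behaviour: do NOT replay the failed request;
// poll a ping endpoint with backoff until the server answers again.
public class ReconnectPolicy {

    // Returns the attempt number on which the server answered, or -1 on give-up.
    static int waitForServer(Supplier<Boolean> pinger, int maxAttempts) {
        long delayMs = 500;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (pinger.get()) {
                return attempt;  // server is back; the user re-issues the action
            }
            try {
                Thread.sleep(delayMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return -1;
            }
            delayMs = Math.min(delayMs * 2, 10_000);  // back off instead of hammering
        }
        return -1;
    }

    public static void main(String[] args) {
        int[] pings = {0};
        // Simulated server that becomes reachable on the third ping.
        Supplier<Boolean> fakePing = () -> ++pings[0] >= 3;
        System.out.println(waitForServer(fakePing, 10));  // 3
    }
}
```

The key design point is that only the idempotent ping is retried automatically; the original (possibly state-changing) request is left to the user, which prevents the queue of duplicate reload commands described above.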






[jira] [Commented] (SOLR-8586) Implement hash over all documents to check for shard synchronization

2016-08-12 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15419184#comment-15419184
 ] 

Yago Riveiro commented on SOLR-8586:


Then I do not understand how this is possible:

https://www.dropbox.com/s/a6e2wrmedop7xjv/Screenshot%202016-08-12%2018.19.22.png?dl=0

Only with 5.5.x and 6.x does the heap grow without bound. Rolling back to 5.4, 
the amount of memory needed to come up is constant ...

With only one node running 5.5.x I have no problems; when I start a second node 
with 5.5.x, they never get past the phase where they check replica 
synchronization.

> Implement hash over all documents to check for shard synchronization
> 
>
> Key: SOLR-8586
> URL: https://issues.apache.org/jira/browse/SOLR-8586
> Project: Solr
>  Issue Type: Improvement
>  Components: SolrCloud
>Reporter: Yonik Seeley
>Assignee: Yonik Seeley
> Fix For: 5.5, 6.0
>
> Attachments: SOLR-8586.patch, SOLR-8586.patch, SOLR-8586.patch, 
> SOLR-8586.patch
>
>
> An order-independent hash across all of the versions in the index should 
> suffice.  The hash itself is pretty easy, but we need to figure out 
> when/where to do this check (for example, I think PeerSync is currently used 
> in multiple contexts and this check would perhaps not be appropriate for all 
> PeerSync calls?)
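The order-independent hash the issue proposes can be illustrated with a toy sketch. The mixing function and combine step below are assumptions for illustration, not Solr's actual fingerprint code: combining per-document version hashes with a commutative operation (here, wrap-around addition) makes the result independent of the order in which documents appear in the index.

```java
// Toy sketch of an order-independent index fingerprint over _version_ values.
public class VersionFingerprint {

    // Simple 64-bit bit-mixer for one version value (illustrative only).
    static long mix(long v) {
        v ^= v >>> 33;
        v *= 0xff51afd7ed558ccdL;
        v ^= v >>> 33;
        return v;
    }

    // Sum of mixed hashes: addition is commutative, so two replicas holding
    // the same set of versions agree regardless of document order.
    static long fingerprint(long[] versions) {
        long sum = 0;
        for (long v : versions) {
            sum += mix(v);
        }
        return sum;
    }

    public static void main(String[] args) {
        long[] a = {101, 202, 303};
        long[] b = {303, 101, 202};  // same versions, different order
        System.out.println(fingerprint(a) == fingerprint(b));  // true
    }
}
```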






[jira] [Comment Edited] (SOLR-8586) Implement hash over all documents to check for shard synchronization

2016-08-12 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15419078#comment-15419078
 ] 

Yago Riveiro edited comment on SOLR-8586 at 8/12/16 4:23 PM:
-

My index has 12T of data indexed with 4.0; the _version_ field only supports 
docValues since 4.7.

To upgrade to 5.x I ran lucene-core-5.x over all my data, but with this new 
feature I need to re-index everything, because I don't have docValues for the 
_version_ field and the feature falls back to the un-inverted method, which 
builds an in-memory structure that doesn't fit in my servers' memory ...

To be honest, this never should have been done in a minor release ... this 
mandatory feature depends on an optional configuration :/

I will either die on 5.4 or spend several months re-indexing data and figuring 
out how to update production without downtime. Not an easy task.




was (Author: yriveiro):
My index has 12T of data indexed with 4.0, the _version_ field only support 
docValues since 4.7.

To Upgrade to 5.x I ran the lucene-core-5.x over all my data,but with this new 
feature I need to re-index all my data because I don't have docValues for 
__version__ field and this feature use instead the un-inverted method that 
creates a memory struct that doesn't fit the memory of my servers ...

To be honest, this never should be done in a minor release ... this mandatory 
feature is based in a optional configuration :/

I will die in 5.4 or spend several months re-indexing data and figure out how 
to update production without downtime.  Not an easy task.









[jira] [Commented] (SOLR-8586) Implement hash over all documents to check for shard synchronization

2016-08-12 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15419078#comment-15419078
 ] 

Yago Riveiro commented on SOLR-8586:


My index has 12T of data indexed with 4.0; the _version_ field only supports 
docValues since 4.7.

To upgrade to 5.x I ran lucene-core-5.x over all my data, but with this new 
feature I need to re-index everything, because I don't have docValues for the 
_version_ field and the feature falls back to the un-inverted method, which 
builds an in-memory structure that doesn't fit in my servers' memory ...

To be honest, this never should have been done in a minor release ... this 
mandatory feature depends on an optional configuration :/

I will either die on 5.4 or spend several months re-indexing data and figuring 
out how to update production without downtime. Not an easy task.









[jira] [Commented] (SOLR-9241) Rebalance API for SolrCloud

2016-08-12 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15418672#comment-15418672
 ] 

Yago Riveiro commented on SOLR-9241:


This feature will be released in 6.x branch or will be a 7.x feature?

> Rebalance API for SolrCloud
> ---
>
> Key: SOLR-9241
> URL: https://issues.apache.org/jira/browse/SOLR-9241
> Project: Solr
>  Issue Type: New Feature
>  Components: SolrCloud
>Affects Versions: 6.1
> Environment: Ubuntu, Mac OsX
>Reporter: Nitin Sharma
>  Labels: Cluster, SolrCloud
> Fix For: 6.1
>
> Attachments: Redistribute_After.jpeg, Redistribute_Before.jpeg, 
> Redistribute_call.jpeg, Replace_After.jpeg, Replace_Before.jpeg, 
> Replace_Call.jpeg, SOLR-9241-4.6.patch, SOLR-9241-6.1.patch
>
>   Original Estimate: 2,016h
>  Remaining Estimate: 2,016h
>
> This is the v1 of the patch for Solrcloud Rebalance api (as described in 
> http://engineering.bloomreach.com/solrcloud-rebalance-api/) , built at 
> Bloomreach by Nitin Sharma and Suruchi Shah. The goal of the API  is to 
> provide a zero downtime mechanism to perform data manipulation and  efficient 
> core allocation in solrcloud. This API was envisioned to be the base layer 
> that enables Solrcloud to be an auto scaling platform. (and work in unison 
> with other complementing monitoring and scaling features).
> Patch Status:
> ===
> The patch is work in progress and incremental. We have done a few rounds of 
> code clean up. We wanted to get the patch going first to get initial feed 
> back.  We will continue to work on making it more open source friendly and 
> easily testable.
>  Deployment Status:
> 
> The platform is deployed in production at bloomreach and has been battle 
> tested for large scale load. (millions of documents and hundreds of 
> collections).
>  Internals:
> =
> The internals of the API and performance : 
> http://engineering.bloomreach.com/solrcloud-rebalance-api/
> It is built on top of the admin collections API as an action (with various 
> flavors). At a high level, the rebalance api provides 2 constructs:
> Scaling Strategy:  Decides how to move the data.  Every flavor has multiple 
> options which can be reviewed in the api spec.
> Re-distribute  - Move around data in the cluster based on capacity/allocation.
> Auto Shard  - Dynamically shard a collection to any size.
> Smart Merge - Distributed Mode - Helps merging data from a larger shard setup 
> into smaller one.  (the source should be divisible by destination)
> Scale up -  Add replicas on the fly
> Scale Down - Remove replicas on the fly
> Allocation Strategy:  Decides where to put the data.  (Nodes with least 
> cores, Nodes that do not have this collection etc). Custom implementations 
> can be built on top as well. One other example is Availability Zone aware. 
> Distribute data such that every replica is placed on different availability 
> zone to support HA.
>  Detailed API Spec:
> 
>   https://github.com/bloomreach/solrcloud-rebalance-api
>  Contributors:
> =
>   Nitin Sharma
>   Suruchi Shah
>  Questions/Comments:
> =
>   You can reach me at nitin.sha...@bloomreach.com






[jira] [Commented] (SOLR-8586) Implement hash over all documents to check for shard synchronization

2016-08-11 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15417460#comment-15417460
 ] 

Yago Riveiro commented on SOLR-8586:


Is this operation memory bound?

I'm trying to upgrade my SolrCloud from 5.4 to 5.5.2 and I can only upgrade one 
node; if I start another node with 5.5.2, the first dies with an OOM.

The second node never gets past the phase where it checks whether the replicas 
are in sync.

The SolrCloud deployment (2 nodes) has no activity at all; it is a cold 
repository for archived data (around 5 billion documents).









[jira] [Commented] (SOLR-4586) Eliminate the maxBooleanClauses limit

2016-08-03 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15406072#comment-15406072
 ] 

Yago Riveiro commented on SOLR-4586:


This parameter should be unlimited by default; if the user wants a limit, it's 
the user's responsibility to set one.

I have hit this limit several times, and it's illogical: if I have the 
resources to run a 10K-clause boolean query, why can't I do it without tweaking 
some obscure parameter?

+1

> Eliminate the maxBooleanClauses limit
> -
>
> Key: SOLR-4586
> URL: https://issues.apache.org/jira/browse/SOLR-4586
> Project: Solr
>  Issue Type: Improvement
>Affects Versions: 4.2
> Environment: 4.3-SNAPSHOT 1456767M - ncindex - 2013-03-15 13:11:50
>Reporter: Shawn Heisey
> Fix For: 5.2, 6.0
>
> Attachments: SOLR-4586.patch, SOLR-4586.patch, SOLR-4586.patch, 
> SOLR-4586.patch, SOLR-4586.patch, SOLR-4586.patch, 
> SOLR-4586_verify_maxClauses.patch
>
>
> In the #solr IRC channel, I mentioned the maxBooleanClauses limitation to 
> someone asking a question about queries.  Mark Miller told me that 
> maxBooleanClauses no longer applies, that the limitation was removed from 
> Lucene sometime in the 3.x series.  The config still shows up in the example 
> even in the just-released 4.2.
> Checking through the source code, I found that the config option is parsed 
> and the value stored in objects, but does not actually seem to be used by 
> anything.  I removed every trace of it that I could find, and all tests still 
> pass.
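For context, the setting under discussion is configured in solrconfig.xml (the value below is illustrative); the issue's point is that the parsed value no longer appears to be used anywhere:

```xml
<!-- solrconfig.xml: the limit discussed above. Shown only to illustrate where
     it is configured; per this issue, the value is parsed but not enforced. -->
<query>
  <maxBooleanClauses>1024</maxBooleanClauses>
</query>
```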






[jira] [Comment Edited] (SOLR-6399) Implement unloadCollection in the Collections API

2016-07-26 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15394033#comment-15394033
 ] 

Yago Riveiro edited comment on SOLR-6399 at 7/26/16 4:24 PM:
-

A flag in core.properties saying that the core is not loaded on startup, a new 
state in ZooKeeper saying that the collection is unloaded (so queries are not 
routed to it, recoveries are not triggered, and it is not reported as down), 
and a command to load the collection on demand would be enough.

I don't want a backup with a restore; I want to notify the cluster not to load 
the data into memory, to save resources, while still being able to load the 
collection on the fly if necessary.

Backing up data involves extra space somewhere: with a 1T collection you need 
another 1T elsewhere for the backup, to say nothing of transferring the data 
over the network ...

Backup and restore is a nice feature, but in huge clusters with a lot of data 
you can't always do it without a huge amount of resources.


was (Author: yriveiro):
With a flag in the core.properties saying that the core is not loaded on 
startup, a new state in zookeeper saying that collection is unloaded to not 
route queries and not trigger recoveries or notify that collection is down, and 
a command to load the collection on demand It's enough.

I don't want to do a backup with a restore, I want notify the cluster to not 
load data to memory to save resources, but if necessary loading the collection 
on the fly.

Backup data involve extra space somewhere, with 1T collection you needs 1T in 
other location to backup, to say nothing of transfer data over the network ...

> Implement unloadCollection in the Collections API
> -
>
> Key: SOLR-6399
> URL: https://issues.apache.org/jira/browse/SOLR-6399
> Project: Solr
>  Issue Type: New Feature
>Reporter: dfdeshom
>Assignee: Shalin Shekhar Mangar
> Fix For: 6.0
>
>
> There is currently no way to unload a collection without deleting its 
> contents. There should be a way in the collections API to unload a collection 
> and reload it later, as needed.
> A use case for this is the following: you store logs by day, with each day 
> having its own collection. You are required to store up to 2 years of data, 
> which adds up to 730 collections.  Most of the time, you'll want to have 3 
> days of data loaded for search. Having just 3 collections loaded into memory, 
> instead of 730 will make managing Solr easier.






[jira] [Commented] (SOLR-6399) Implement unloadCollection in the Collections API

2016-07-26 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15394033#comment-15394033
 ] 

Yago Riveiro commented on SOLR-6399:


A flag in core.properties saying that the core is not loaded on startup, a new 
state in ZooKeeper saying that the collection is unloaded (so queries are not 
routed to it, recoveries are not triggered, and it is not reported as down), 
and a command to load the collection on demand would be enough.

I don't want a backup with a restore; I want to notify the cluster not to load 
the data into memory, to save resources, while still being able to load the 
collection on the fly if necessary.

Backing up data involves extra space somewhere: with a 1T collection you need 
another 1T elsewhere for the backup, to say nothing of transferring the data 
over the network ...
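For standalone (non-SolrCloud) cores, part of what is asked for already exists via the transient-core properties in core.properties. A sketch with illustrative values follows; note the issue asks for a SolrCloud collection-level equivalent, which does not exist:

```properties
# core.properties sketch for a standalone core (illustrative values).
name=logs_2016_07_24
loadOnStartup=false   # do not load this core when Solr starts
transient=true        # allow Solr to unload it when the transient-core cache is full
```

With these properties the core is only loaded on first use, which is the per-core analogue of the "load collection on demand" behaviour described above.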







[jira] [Commented] (SOLR-9241) Rebalance API for SolrCloud

2016-07-21 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15387792#comment-15387792
 ] 

Yago Riveiro commented on SOLR-9241:


I have one collection with 6 shards of 200G each (1.2T in total). 
Hypothetically, using this API I would want to transform it into a 12-shard 
collection; my concern is whether the API will get the job done or fail.







[jira] [Commented] (SOLR-9241) Rebalance API for SolrCloud

2016-07-21 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15387498#comment-15387498
 ] 

Yago Riveiro commented on SOLR-9241:


Will this work with 300G shards? The current SPLITSHARD command never completes 
successfully in my case :(

Other important things to keep in mind:
- The operation can take a week if necessary, but it can't crash the cluster ...
- The level of resources allocated to this task should be configurable; I don't 
know how, but something like a maximum amount of memory and threads for the task

P.S: This feature is like ... awesome :D 







[jira] [Commented] (SOLR-8873) Enforce dataDir/instanceDir/ulogDir to be paths that contain only a controlled subset of characters

2016-04-04 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15225253#comment-15225253
 ] 

Yago Riveiro commented on SOLR-8873:


Restricting choices without any evidence of issues is halfway to people (like 
me) starting to question why the restriction was enforced.

> Enforce dataDir/instanceDir/ulogDir to be paths that contain only a 
> controlled subset of characters
> ---
>
> Key: SOLR-8873
> URL: https://issues.apache.org/jira/browse/SOLR-8873
> Project: Solr
>  Issue Type: Improvement
>Reporter: Tomás Fernández Löbbe
> Attachments: SOLR-8873.patch
>
>
> We currently support any valid path for dataDir/instanceDir/ulogDir. I think 
> we should prevent special characters and restrict to a subset that is 
> commonly used and tested.
> My initial proposals it to allow the Java pattern: 
> {code:java}"^[a-zA-Z0-9\\.\\ \\-_/\"':]+$"{code} but I'm open to 
> suggestions. I'm not sure if there can be issues with HDFS paths (this 
> pattern does pass the tests we currently have), or some other use case I'm 
> not considering.
> I also think our tests should use all those characters randomly. 
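The proposed pattern can be compiled and exercised directly; the sample paths below are illustrative:

```java
import java.util.regex.Pattern;

// Compile the pattern proposed in the issue and test a few sample paths.
public class PathCharCheck {
    static final Pattern ALLOWED = Pattern.compile("^[a-zA-Z0-9\\.\\ \\-_/\"':]+$");

    static boolean isAllowed(String path) {
        return ALLOWED.matcher(path).matches();
    }

    public static void main(String[] args) {
        System.out.println(isAllowed("/var/solr/data/core1"));  // true
        System.out.println(isAllowed("C:/solr/my index"));      // true (space and colon allowed)
        System.out.println(isAllowed("data$dir"));              // false ($ not in the class)
    }
}
```

Running sample HDFS-style and Windows-style paths through a check like this is one way to surface the edge cases the reporter is unsure about.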






[jira] [Commented] (SOLR-8741) Json Facet API, numBuckets not returning real number of buckets.

2016-03-30 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15217835#comment-15217835
 ] 

Yago Riveiro commented on SOLR-8741:


I hit this bug too.

If I discard the docs that do not have the field that triggers the NPE 
(q=field:* to fetch only docs with values), the hll function doesn't throw the 
NPE.

> Json Facet API, numBuckets not returning real number of buckets.
> 
>
> Key: SOLR-8741
> URL: https://issues.apache.org/jira/browse/SOLR-8741
> Project: Solr
>  Issue Type: Bug
>  Components: Facet Module
>Reporter: Pablo Anzorena
>
> Hi, using the JSON facet API I realized that numBuckets is wrong: it is not 
> returning the right number of buckets. I have a dimension for which 
> numBuckets says there are 1340 buckets, but retrieving all the results brings 
> back 988. 
> FYI the field is of type string.
> Thanks.






[jira] [Comment Edited] (SOLR-8642) SOLR allows creation of collections with invalid names

2016-03-28 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15215025#comment-15215025
 ] 

Yago Riveiro edited comment on SOLR-8642 at 3/28/16 11:09 PM:
--

This can teach us something I learned with Linux some time ago: when we 
release an API, we release a legacy, because people will develop codebases 
against it (including its wrong behaviours).

If the API is broken, people like me will be in trouble. This is why you see 
system calls with the same name plus a number at the end, deprecated only some 
ten years later.

Improvements are good, and I believe this is being done for a good reason, but 
without tools that let people migrate from the old behaviours it is not useful.

Solr should have an LTS version, or at least not break backwards compatibility 
within a major release series. It's not the first time I've been in this 
situation, and every time I have to explain to my boss that something is broken 
in our current version but we can't upgrade because something else is broken in 
the next version, I can feel his assassin instinct :p 

Annoyance level: 9997



was (Author: yriveiro):
This can be use to learn something that I did with linux some time ago. When we 
releases and API, we release legacy, because people will develop a codebase 
using it (this include the wrong behaviours).

If the API is broken, people like me will be in troubles. This is the reason to 
see system calls with the same name and a number in the end and are deprecated 
like 10 years later.

Improvements are good, And I believe that this is doing for a good reason, but 
without tools that allow people to migrate from older behaviours are not useful.

Solr should have an LTS version, or at least don't introduce BC in a major 
release. It's not the first time that I pass for this situation, and every time 
that I need to explain to my boss that something is broken in our current 
version but we can't upgrade because other thing is broken in next version, I 
feel his assassin instinct :p 

Annoying level to 9997


> SOLR allows creation of collections with invalid names
> --
>
> Key: SOLR-8642
> URL: https://issues.apache.org/jira/browse/SOLR-8642
> Project: Solr
>  Issue Type: Bug
>  Components: SolrCloud
>Affects Versions: master
>Reporter: Jason Gerlowski
>Assignee: Erick Erickson
>Priority: Minor
> Fix For: 5.5, master
>
> Attachments: SOLR-8642.patch, SOLR-8642.patch, SOLR-8642.patch, 
> SOLR-8642.patch
>
>
> Some of my colleagues and I recently noticed that the CREATECOLLECTION API 
> will create a collection even when invalid characters are present in the name.
> For example, consider the following reproduction case, which involves 
> creating a collection with a space in its name:
> {code}
> $ 
> $ bin/solr start -e cloud -noprompt
> ...
> $ curl -i -l -k -X GET 
> "http://localhost:8983/solr/admin/collections?action=CREATE&name=getting+started&numShards=2&replicationFactor=2&maxShardsPerNode=2&collection.configName=gettingstarted";
> HTTP/1.1 200 OK
> Content-Type: application/xml; charset=UTF-8
> Transfer-Encoding: chunked
> 
> 
> 0 name="QTime">299 name="failure">org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:Error
>  from server at http://127.0.1.1:8983/solr: Error CREATEing SolrCore 'getting 
> started_shard2_replica2': Unable to create core [getting 
> started_shard2_replica2] Caused by: Invalid core name: 'getting 
> started_shard2_replica2' Names must consist entirely of periods, underscores 
> and 
> alphanumericsorg.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:Error
>  from server at http://127.0.1.1:7574/solr: Error CREATEing SolrCore 'getting 
> started_shard2_replica1': Unable to create core [getting 
> started_shard2_replica1] Caused by: Invalid core name: 'getting 
> started_shard2_replica1' Names must consist entirely of periods, underscores 
> and 
> alphanumericsorg.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:Error
>  from server at http://127.0.1.1:7574/solr: Error CREATEing SolrCore 'getting 
> started_shard1_replica1': Unable to create core [getting 
> started_shard1_replica1] Caused by: Invalid core name: 'getting 
> started_shard1_replica1' Names must consist entirely of periods, underscores 
> and 
> alphanumericsorg.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:Error
>  from server at http://127.0.1.1:8983/

[jira] [Commented] (SOLR-8642) SOLR allows creation of collections with invalid names

2016-03-28 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15215025#comment-15215025
 ] 

Yago Riveiro commented on SOLR-8642:


This can teach us something I learned with Linux some time ago: when we 
release an API, we release a legacy, because people will develop codebases 
against it (including its wrong behaviours).

If the API is broken, people like me will be in trouble. This is why you see 
system calls with the same name plus a number at the end, deprecated only some 
ten years later.

Improvements are good, and I believe this is being done for a good reason, but 
without tools that let people migrate from the old behaviours it is not useful.

Solr should have an LTS version, or at least not break backwards compatibility 
within a major release series. It's not the first time I've been in this 
situation, and every time I have to explain to my boss that something is broken 
in our current version but we can't upgrade because something else is broken in 
the next version, I can feel his assassin instinct :p 

Annoyance level: 9997


> SOLR allows creation of collections with invalid names
> --
>
> Key: SOLR-8642
> URL: https://issues.apache.org/jira/browse/SOLR-8642
> Project: Solr
>  Issue Type: Bug
>  Components: SolrCloud
>Affects Versions: master
>Reporter: Jason Gerlowski
>Assignee: Erick Erickson
>Priority: Minor
> Fix For: 5.5, master
>
> Attachments: SOLR-8642.patch, SOLR-8642.patch, SOLR-8642.patch, 
> SOLR-8642.patch
>
>
> Some of my colleagues and I recently noticed that the CREATECOLLECTION API 
> will create a collection even when invalid characters are present in the name.
> For example, consider the following reproduction case, which involves 
> creating a collection with a space in its name:
> {code}
> $ 
> $ bin/solr start -e cloud -noprompt
> ...
> $ curl -i -l -k -X GET 
> "http://localhost:8983/solr/admin/collections?action=CREATE&name=getting+started&numShards=2&replicationFactor=2&maxShardsPerNode=2&collection.configName=gettingstarted";
> HTTP/1.1 200 OK
> Content-Type: application/xml; charset=UTF-8
> Transfer-Encoding: chunked
> 
> 
> 0 name="QTime">299 name="failure">org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:Error
>  from server at http://127.0.1.1:8983/solr: Error CREATEing SolrCore 'getting 
> started_shard2_replica2': Unable to create core [getting 
> started_shard2_replica2] Caused by: Invalid core name: 'getting 
> started_shard2_replica2' Names must consist entirely of periods, underscores 
> and 
> alphanumericsorg.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:Error
>  from server at http://127.0.1.1:7574/solr: Error CREATEing SolrCore 'getting 
> started_shard2_replica1': Unable to create core [getting 
> started_shard2_replica1] Caused by: Invalid core name: 'getting 
> started_shard2_replica1' Names must consist entirely of periods, underscores 
> and 
> alphanumericsorg.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:Error
>  from server at http://127.0.1.1:7574/solr: Error CREATEing SolrCore 'getting 
> started_shard1_replica1': Unable to create core [getting 
> started_shard1_replica1] Caused by: Invalid core name: 'getting 
> started_shard1_replica1' Names must consist entirely of periods, underscores 
> and 
> alphanumericsorg.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:Error
>  from server at http://127.0.1.1:8983/solr: Error CREATEing SolrCore 'getting 
> started_shard1_replica2': Unable to create core [getting 
> started_shard1_replica2] Caused by: Invalid core name: 'getting 
> started_shard1_replica2' Names must consist entirely of periods, underscores 
> and alphanumerics
> 
> $ 
> $ curl -i -l -k -X GET 
> "http://localhost:8983/solr/admin/collections?action=CLUSTERSTATUS&wt=json&indent=true";
> HTTP/1.1 200 OK
> Content-Type: application/json; charset=UTF-8
> Transfer-Encoding: chunked
> {
>   "responseHeader":{
> "status":0,
> "QTime":6},
>   "cluster":{
> "collections":{
>  ...
>   "getting started":{
> "replicationFactor":"2",
> "shards":{
>   "shard1":{
> "range":"8000-",
> "state&quo

[jira] [Commented] (SOLR-8110) Start enforcing field naming recomendations in next X.0 release?

2016-03-28 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15214992#comment-15214992
 ] 

Yago Riveiro commented on SOLR-8110:


My bad. This issue was pointed to on IRC as the actual place of discussion 
about name enforcement. 

This issue is about schema fields, not the one that enforces collection 
names.

> Start enforcing field naming recomendations in next X.0 release?
> 
>
> Key: SOLR-8110
> URL: https://issues.apache.org/jira/browse/SOLR-8110
> Project: Solr
>  Issue Type: Improvement
>Reporter: Hoss Man
> Attachments: SOLR-8110.patch, SOLR-8110.patch
>
>
> For a very long time now, Solr has made the following "recommendation" 
> regarding field naming conventions...
> bq. field names should consist of alphanumeric or underscore characters only 
> and not start with a digit.  This is not currently strictly enforced, but 
> other field names will not have first class support from all components and 
> back compatibility is not guaranteed.  ...
> I'm opening this issue to track discussion about if/how we should start 
> enforcing this as a rule instead (instead of just a "recommendation") in our 
> next/future X.0 (ie: major) release.
> The goals of doing so being:
> * simplify some existing code/apis that currently use heuristics to deal with 
> lists of fields and produce strange errors when the heuristic fails (example: 
> ReturnFields.add)
> * reduce confusion/pain for new users who might start out unaware of the 
> recommended conventions and then only later encountering a situation where 
> their field names are not supported by some feature and get frustrated 
> because they have to change their schema, reindex, update index/query client 
> expectations, etc...






[jira] [Comment Edited] (SOLR-8110) Start enforcing field naming recomendations in next X.0 release?

2016-03-28 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15214325#comment-15214325
 ] 

Yago Riveiro edited comment on SOLR-8110 at 3/28/16 3:39 PM:
-

This enforcement shouldn't happen without an API to rename collections ... and 
don't forget that there are people with indexes holding terabytes of data who 
can't do a full re-index.


was (Author: yriveiro):
This enforcing shouldn't happen without an API to rename collections ... and 
don't not forget that there is people with indexes with terabytes of data that 
can't do a full re-index

> Start enforcing field naming recomendations in next X.0 release?
> 
>
> Key: SOLR-8110
> URL: https://issues.apache.org/jira/browse/SOLR-8110
> Project: Solr
>  Issue Type: Improvement
>Reporter: Hoss Man
> Attachments: SOLR-8110.patch, SOLR-8110.patch
>
>
> For a very long time now, Solr has made the following "recommendation" 
> regarding field naming conventions...
> bq. field names should consist of alphanumeric or underscore characters only 
> and not start with a digit.  This is not currently strictly enforced, but 
> other field names will not have first class support from all components and 
> back compatibility is not guaranteed.  ...
> I'm opening this issue to track discussion about if/how we should start 
> enforcing this as a rule instead (instead of just a "recommendation") in our 
> next/future X.0 (ie: major) release.
> The goals of doing so being:
> * simplify some existing code/apis that currently use heuristics to deal with 
> lists of fields and produce strange errors when the heuristic fails (example: 
> ReturnFields.add)
> * reduce confusion/pain for new users who might start out unaware of the 
> recommended conventions and then only later encountering a situation where 
> their field names are not supported by some feature and get frustrated 
> because they have to change their schema, reindex, update index/query client 
> expectations, etc...






[jira] [Commented] (SOLR-8110) Start enforcing field naming recomendations in next X.0 release?

2016-03-28 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15214325#comment-15214325
 ] 

Yago Riveiro commented on SOLR-8110:


This enforcement shouldn't happen without an API to rename collections ... and 
don't forget that there are people with indexes holding terabytes of data who 
can't do a full re-index.

> Start enforcing field naming recomendations in next X.0 release?
> 
>
> Key: SOLR-8110
> URL: https://issues.apache.org/jira/browse/SOLR-8110
> Project: Solr
>  Issue Type: Improvement
>Reporter: Hoss Man
> Attachments: SOLR-8110.patch, SOLR-8110.patch
>
>
> For a very long time now, Solr has made the following "recommendation" 
> regarding field naming conventions...
> bq. field names should consist of alphanumeric or underscore characters only 
> and not start with a digit.  This is not currently strictly enforced, but 
> other field names will not have first class support from all components and 
> back compatibility is not guaranteed.  ...
> I'm opening this issue to track discussion about if/how we should start 
> enforcing this as a rule instead (instead of just a "recommendation") in our 
> next/future X.0 (ie: major) release.
> The goals of doing so being:
> * simplify some existing code/apis that currently use heuristics to deal with 
> lists of fields and produce strange errors when the heuristic fails (example: 
> ReturnFields.add)
> * reduce confusion/pain for new users who might start out unaware of the 
> recommended conventions and then only later encountering a situation where 
> their field names are not supported by some feature and get frustrated 
> because they have to change their schema, reindex, update index/query client 
> expectations, etc...






[jira] [Comment Edited] (SOLR-8642) SOLR allows creation of collections with invalid names

2016-03-28 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15214269#comment-15214269
 ] 

Yago Riveiro edited comment on SOLR-8642 at 3/28/16 3:11 PM:
-

Hi,

I can't believe that I can't use a hyphen to create my collections ... I have 
thousands of collections with hyphens, and basically I have an automatic system 
that creates collections on the fly, plus a codebase that relies on collection 
names.

Sorry, but this change can't be done without an API that allows renaming a 
collection.

I can't upgrade to 5.5 because I can't create collections. This kind of change 
can't land in the middle of a major release; the enforcement should be optional.

In 4.x someone decided that on-disk DocValues didn't make sense and deprecated 
them in the middle of a major release: 10T of data to optimize just to wipe the 
disk format and use the "default", and 3 months to do it without downtime. Now 
I can't create collections because someone "decided" that hyphens are not 
allowed. (I've used Solr since 3.x with no problems with hyphens.)

Sorry, but this is annoying. 


was (Author: yriveiro):
Hi,

I can believe that I can't use a hyphen to create my collections ... I have 
thousand of collection with hyphens and basically I have a automatic system 
that creates the collections on the fly, and codebase that relay in collection 
names.

Sorry but this change can't be done without a API that allow rename a 
collection.

I can't upgrade to 5.5 because I can't create collections. This kind of changes 
can't go in the middle of a major release. This enforcing should be optional.

In 4.x someone decides that DocValues in disk doesn't make sense and deprecated 
it in the middle of a major release, 10T of data to optimize to wipe the Disk 
format to use de "default" and 3 month to do it without downtime. Now I can 
create collections because someone "decides" that hyphens are not allowed. (I 
use Solr since 3.x, no problems with hyphens).

Sorry but this is annoying level .

> SOLR allows creation of collections with invalid names
> --
>
> Key: SOLR-8642
> URL: https://issues.apache.org/jira/browse/SOLR-8642
> Project: Solr
>  Issue Type: Bug
>  Components: SolrCloud
>Affects Versions: master
>Reporter: Jason Gerlowski
>Assignee: Erick Erickson
>Priority: Minor
> Fix For: 5.5, master
>
> Attachments: SOLR-8642.patch, SOLR-8642.patch, SOLR-8642.patch, 
> SOLR-8642.patch
>
>
> Some of my colleagues and I recently noticed that the CREATECOLLECTION API 
> will create a collection even when invalid characters are present in the name.
> For example, consider the following reproduction case, which involves 
> creating a collection with a space in its name:
> {code}
> $ 
> $ bin/solr start -e cloud -noprompt
> ...
> $ curl -i -l -k -X GET 
> "http://localhost:8983/solr/admin/collections?action=CREATE&name=getting+started&numShards=2&replicationFactor=2&maxShardsPerNode=2&collection.configName=gettingstarted";
> HTTP/1.1 200 OK
> Content-Type: application/xml; charset=UTF-8
> Transfer-Encoding: chunked
> 
> 
> 0 name="QTime">299 name="failure">org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:Error
>  from server at http://127.0.1.1:8983/solr: Error CREATEing SolrCore 'getting 
> started_shard2_replica2': Unable to create core [getting 
> started_shard2_replica2] Caused by: Invalid core name: 'getting 
> started_shard2_replica2' Names must consist entirely of periods, underscores 
> and 
> alphanumericsorg.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:Error
>  from server at http://127.0.1.1:7574/solr: Error CREATEing SolrCore 'getting 
> started_shard2_replica1': Unable to create core [getting 
> started_shard2_replica1] Caused by: Invalid core name: 'getting 
> started_shard2_replica1' Names must consist entirely of periods, underscores 
> and 
> alphanumericsorg.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:Error
>  from server at http://127.0.1.1:7574/solr: Error CREATEing SolrCore 'getting 
> started_shard1_replica1': Unable to create core [getting 
> started_shard1_replica1] Caused by: Invalid core name: 'getting 
> started_shard1_replica1' Names must consist entirely of periods, underscores 
> and 
> alphanumericsorg.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:Error
>  from serv

[jira] [Comment Edited] (SOLR-8642) SOLR allows creation of collections with invalid names

2016-03-28 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15214269#comment-15214269
 ] 

Yago Riveiro edited comment on SOLR-8642 at 3/28/16 3:05 PM:
-

Hi,

I can't believe that I can't use a hyphen to create my collections ... I have 
thousands of collections with hyphens, and basically I have an automatic system 
that creates collections on the fly, plus a codebase that relies on collection 
names.

Sorry, but this change can't be done without an API that allows renaming a 
collection.

I can't upgrade to 5.5 because I can't create collections. This kind of change 
can't land in the middle of a major release; the enforcement should be optional.

In 4.x someone decided that on-disk DocValues didn't make sense and deprecated 
them in the middle of a major release: 10T of data to optimize just to wipe the 
disk format and use the "default", and 3 months to do it without downtime. Now 
I can't create collections because someone "decided" that hyphens are not 
allowed. (I've used Solr since 3.x with no problems with hyphens.)

Sorry, but this is annoying. 


was (Author: yriveiro):
Hi,

I can believe that I can't use a hyphen to create my collections ... I have 
thousand of collection with hyphens and basically I have a automatic system 
that creates the collections on the fly, and codebase that relay in collection 
names.

Sorry but this change can't be done without a API that allow rename a 
collection.

I can't upgrade to 5.5 because I can't create collections. This can of changes 
can't go in the middle of a major release. This enforcing should be optional.

In 4.x someone decides that DocValues in disk doesn't make sense and deprecated 
it in the middle of a major release, 10T of data to optimize to wipe the Disk 
format to use de "default" and 3 month to do it without downtime. Now I can 
create collections because someone "decides" that hyphens are not allowed. (I 
use Solr since 3.x, no problems with hyphens).

Sorry but this is annoying level .

> SOLR allows creation of collections with invalid names
> --
>
> Key: SOLR-8642
> URL: https://issues.apache.org/jira/browse/SOLR-8642
> Project: Solr
>  Issue Type: Bug
>  Components: SolrCloud
>Affects Versions: master
>Reporter: Jason Gerlowski
>Assignee: Erick Erickson
>Priority: Minor
> Fix For: 5.5, master
>
> Attachments: SOLR-8642.patch, SOLR-8642.patch, SOLR-8642.patch, 
> SOLR-8642.patch
>
>
> Some of my colleagues and I recently noticed that the CREATECOLLECTION API 
> will create a collection even when invalid characters are present in the name.
> For example, consider the following reproduction case, which involves 
> creating a collection with a space in its name:
> {code}
> $ 
> $ bin/solr start -e cloud -noprompt
> ...
> $ curl -i -l -k -X GET 
> "http://localhost:8983/solr/admin/collections?action=CREATE&name=getting+started&numShards=2&replicationFactor=2&maxShardsPerNode=2&collection.configName=gettingstarted";
> HTTP/1.1 200 OK
> Content-Type: application/xml; charset=UTF-8
> Transfer-Encoding: chunked
> 
> 
> 0 name="QTime">299 name="failure">org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:Error
>  from server at http://127.0.1.1:8983/solr: Error CREATEing SolrCore 'getting 
> started_shard2_replica2': Unable to create core [getting 
> started_shard2_replica2] Caused by: Invalid core name: 'getting 
> started_shard2_replica2' Names must consist entirely of periods, underscores 
> and 
> alphanumericsorg.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:Error
>  from server at http://127.0.1.1:7574/solr: Error CREATEing SolrCore 'getting 
> started_shard2_replica1': Unable to create core [getting 
> started_shard2_replica1] Caused by: Invalid core name: 'getting 
> started_shard2_replica1' Names must consist entirely of periods, underscores 
> and 
> alphanumericsorg.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:Error
>  from server at http://127.0.1.1:7574/solr: Error CREATEing SolrCore 'getting 
> started_shard1_replica1': Unable to create core [getting 
> started_shard1_replica1] Caused by: Invalid core name: 'getting 
> started_shard1_replica1' Names must consist entirely of periods, underscores 
> and 
> alphanumericsorg.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:Error
>  from server at

[jira] [Commented] (SOLR-8642) SOLR allows creation of collections with invalid names

2016-03-28 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15214269#comment-15214269
 ] 

Yago Riveiro commented on SOLR-8642:


Hi,

I can't believe that I can't use a hyphen to create my collections ... I have 
thousands of collections with hyphens, and basically I have an automatic system 
that creates collections on the fly, plus a codebase that relies on collection 
names.

Sorry, but this change can't be done without an API that allows renaming a 
collection.

I can't upgrade to 5.5 because I can't create collections. This kind of change 
can't land in the middle of a major release; the enforcement should be optional.

In 4.x someone decided that on-disk DocValues didn't make sense and deprecated 
them in the middle of a major release: 10T of data to optimize just to wipe the 
disk format and use the "default", and 3 months to do it without downtime. Now 
I can't create collections because someone "decided" that hyphens are not 
allowed. (I've used Solr since 3.x with no problems with hyphens.)

Sorry, but this is annoying. 
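
For illustration, the "Invalid core name" error quoted throughout this thread ("Names must consist entirely of periods, underscores and alphanumerics") implies a character check roughly like the following sketch. This is hypothetical code reconstructed only from the error message, not Solr's actual implementation:

```python
import re

# Hypothetical reconstruction of the 5.5-era name check, based only on the
# error message: periods, underscores and alphanumerics. Note that '-' is
# absent from the set, which is exactly what breaks hyphenated names.
LEGAL_NAME = re.compile(r"^[A-Za-z0-9._]+$")

def is_legal_name(name: str) -> bool:
    return LEGAL_NAME.match(name) is not None

print(is_legal_name("getting_started"))   # True
print(is_legal_name("getting started"))   # False: space rejected
print(is_legal_name("logs-2016-03-28"))   # False: hyphens rejected
```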

> SOLR allows creation of collections with invalid names
> --
>
> Key: SOLR-8642
> URL: https://issues.apache.org/jira/browse/SOLR-8642
> Project: Solr
>  Issue Type: Bug
>  Components: SolrCloud
>Affects Versions: master
>Reporter: Jason Gerlowski
>Assignee: Erick Erickson
>Priority: Minor
> Fix For: 5.5, master
>
> Attachments: SOLR-8642.patch, SOLR-8642.patch, SOLR-8642.patch, 
> SOLR-8642.patch
>
>
> Some of my colleagues and I recently noticed that the CREATECOLLECTION API 
> will create a collection even when invalid characters are present in the name.
> For example, consider the following reproduction case, which involves 
> creating a collection with a space in its name:
> {code}
> $ 
> $ bin/solr start -e cloud -noprompt
> ...
> $ curl -i -l -k -X GET 
> "http://localhost:8983/solr/admin/collections?action=CREATE&name=getting+started&numShards=2&replicationFactor=2&maxShardsPerNode=2&collection.configName=gettingstarted";
> HTTP/1.1 200 OK
> Content-Type: application/xml; charset=UTF-8
> Transfer-Encoding: chunked
> 
> 
> 0 name="QTime">299 name="failure">org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:Error
>  from server at http://127.0.1.1:8983/solr: Error CREATEing SolrCore 'getting 
> started_shard2_replica2': Unable to create core [getting 
> started_shard2_replica2] Caused by: Invalid core name: 'getting 
> started_shard2_replica2' Names must consist entirely of periods, underscores 
> and 
> alphanumericsorg.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:Error
>  from server at http://127.0.1.1:7574/solr: Error CREATEing SolrCore 'getting 
> started_shard2_replica1': Unable to create core [getting 
> started_shard2_replica1] Caused by: Invalid core name: 'getting 
> started_shard2_replica1' Names must consist entirely of periods, underscores 
> and 
> alphanumericsorg.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:Error
>  from server at http://127.0.1.1:7574/solr: Error CREATEing SolrCore 'getting 
> started_shard1_replica1': Unable to create core [getting 
> started_shard1_replica1] Caused by: Invalid core name: 'getting 
> started_shard1_replica1' Names must consist entirely of periods, underscores 
> and 
> alphanumericsorg.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:Error
>  from server at http://127.0.1.1:8983/solr: Error CREATEing SolrCore 'getting 
> started_shard1_replica2': Unable to create core [getting 
> started_shard1_replica2] Caused by: Invalid core name: 'getting 
> started_shard1_replica2' Names must consist entirely of periods, underscores 
> and alphanumerics
> 
> $ 
> $ curl -i -l -k -X GET 
> "http://localhost:8983/solr/admin/collections?action=CLUSTERSTATUS&wt=json&indent=true";
> HTTP/1.1 200 OK
> Content-Type: application/json; charset=UTF-8
> Transfer-Encoding: chunked
> {
>   "responseHeader":{
> "status":0,
> "QTime":6},
>   "cluster":{
> "collections":{
>  ...
>   "getting started":{
> "replicationFactor":"2",
> "shards":{
>   "shard1":{
> "range":"8000-",
> &

[jira] [Commented] (SOLR-7452) json facet api returning inconsistent counts in cloud set up

2016-03-15 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15195024#comment-15195024
 ] 

Yago Riveiro commented on SOLR-7452:


[~yo...@apache.org] This feature is important for getting accurate counts.

Any chance of having this before 6.0 is released?

P.S.: The documentation doesn't mention this limitation, which is a big 
downside in some scenarios.

> json facet api returning inconsistent counts in cloud set up
> 
>
> Key: SOLR-7452
> URL: https://issues.apache.org/jira/browse/SOLR-7452
> Project: Solr
>  Issue Type: Bug
>  Components: Facet Module
>Affects Versions: 5.1
>Reporter: Vamsi Krishna D
>  Labels: count, facet, sort
> Fix For: 5.2
>
>   Original Estimate: 96h
>  Remaining Estimate: 96h
>
> While using the newly added feature of json term facet api 
> (http://yonik.com/json-facet-api/#TermsFacet) I am encountering inconsistent 
> returns of counts of faceted value ( Note I am running on a cloud mode of 
> solr). For example consider that i have txns_id(unique field or key), 
> consumer_number and amount. Now for a 10 million such records , lets say i 
> query for 
> q=*:*&rows=0&
>  json.facet={
>biskatoo:{
>type : terms,
>field : consumer_number,
>limit : 20,
>   sort : {y:desc},
>   numBuckets : true,
>   facet:{
>y : "sum(amount)"
>}
>}
>  }
> the results are as follows ( some are omitted ):
> "facets":{
> "count":6641277,
> "biskatoo":{
>   "numBuckets":3112708,
>   "buckets":[{
>   "val":"surya",
>   "count":4,
>   "y":2.264506},
>   {
>   "val":"raghu",
>   "COUNT":3,   // capitalised for recognition 
>   "y":1.8},
> {
>   "val":"malli",
>   "count":4,
>   "y":1.78}]}}}
> but if i restrict the query to 
> q=consumer_number:raghu&rows=0&
>  json.facet={
>biskatoo:{
>type : terms,
>field : consumer_number,
>limit : 20,
>   sort : {y:desc},
>   numBuckets : true,
>   facet:{
>y : "sum(amount)"
>}
>}
>  }
> i get :
>   "facets":{
> "count":4,
> "biskatoo":{
>   "numBuckets":1,
>   "buckets":[{
>   "val":"raghu",
>   "COUNT":4,
>   "y":2429708.24}]}}}
> One can see the count results are inconsistent ( and I found many occasions 
> of inconsistencies).
> I have tried the patch https://issues.apache.org/jira/browse/SOLR-7412 but 
> still the issue seems not resolved
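
The inconsistency described above matches a known property of unrefined distributed facets: each shard returns only its own top-`limit` buckets, and the coordinator sums whatever it received, so a term that misses the cut on one shard loses that shard's count. A toy illustration of the merge step (plain Python, not Solr code; the shard data is invented):

```python
from collections import Counter

# Two hypothetical "shards" with overlapping terms.
shard1 = Counter({"raghu": 3, "surya": 2, "malli": 2})
shard2 = Counter({"surya": 2, "malli": 2, "raghu": 1})

def merged_facet(shards, limit):
    """Merge per-shard top-`limit` buckets without any refinement pass."""
    merged = Counter()
    for counts in shards:
        # Each shard contributes only its own top-`limit` terms.
        merged.update(dict(counts.most_common(limit)))
    return merged

# With limit=2, "malli" misses shard1's top-2 cut, so its merged count is 2
# even though its true global count is 4:
print(merged_facet([shard1, shard2], limit=2)["malli"])  # 2
print((shard1 + shard2)["malli"])                        # 4
```

Facet refinement (added later via `refine: true` in the JSON Facet API) addresses this by asking shards for counts of buckets they did not originally return.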






[jira] [Comment Edited] (SOLR-6399) Implement unloadCollection in the Collections API

2016-03-08 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15184986#comment-15184986
 ] 

Yago Riveiro edited comment on SOLR-6399 at 3/8/16 2:44 PM:


Any chance of this issue seeing the light of day?

In setups with thousands of collections, this feature is very useful for not 
spending resources on collections without activity.


was (Author: yriveiro):
Any chance of this issue see the light of day?

In setups with thousand of collections this feature is very useful to not use 
resources in collections with activity.

> Implement unloadCollection in the Collections API
> -
>
> Key: SOLR-6399
> URL: https://issues.apache.org/jira/browse/SOLR-6399
> Project: Solr
>  Issue Type: New Feature
>Affects Versions: 4.9
>Reporter: dfdeshom
>Assignee: Shalin Shekhar Mangar
> Fix For: master
>
>
> There is currently no way to unload a collection without deleting its 
> contents. There should be a way in the collections API to unload a collection 
> and reload it later, as needed.
> A use case for this is the following: you store logs by day, with each day 
> having its own collection. You are required to store up to 2 years of data, 
> which adds up to 730 collections.  Most of the time, you'll want to have 3 
> days of data loaded for search. Having just 3 collections loaded into memory, 
> instead of 730 will make managing Solr easier.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)




[jira] [Commented] (SOLR-6399) Implement unloadCollection in the Collections API

2016-03-08 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15184986#comment-15184986
 ] 

Yago Riveiro commented on SOLR-6399:


Any chance of this issue seeing the light of day?

In setups with thousands of collections this feature is very useful, to avoid 
using resources on collections without activity.

> Implement unloadCollection in the Collections API
> -
>
> Key: SOLR-6399
> URL: https://issues.apache.org/jira/browse/SOLR-6399
> Project: Solr
>  Issue Type: New Feature
>Affects Versions: 4.9
>Reporter: dfdeshom
>Assignee: Shalin Shekhar Mangar
> Fix For: master
>
>
> There is currently no way to unload a collection without deleting its 
> contents. There should be a way in the collections API to unload a collection 
> and reload it later, as needed.
> A use case for this is the following: you store logs by day, with each day 
> having its own collection. You are required to store up to 2 years of data, 
> which adds up to 730 collections.  Most of the time, you'll want to have 3 
> days of data loaded for search. Having just 3 collections loaded into memory, 
> instead of 730 will make managing Solr easier.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)




[jira] [Commented] (SOLR-8635) Shards don't propagate the document update correctly

2016-02-03 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15130531#comment-15130531
 ] 

Yago Riveiro commented on SOLR-8635:


In your solrconfig you have:
{quote}
<luceneMatchVersion>LUCENE_40</luceneMatchVersion>
{quote}
and it should be:
{quote}
<luceneMatchVersion>LUCENE_5.4.1</luceneMatchVersion>
{quote}

> Shards don't propagate the document update correctly
> 
>
> Key: SOLR-8635
> URL: https://issues.apache.org/jira/browse/SOLR-8635
> Project: Solr
>  Issue Type: Bug
>  Components: SolrCloud
>Affects Versions: 5.4.1
> Environment: - Red Hat Enterprise Linux Server release 5.6 (Tikanga)
> - Oracle jdk1.7.0_79
> - Apache Solr 5.4.1
> - Apache Zookeeper 3.4.6
>Reporter: Alberto Ferrini
>  Labels: shard, solrcloud, update
> Attachments: schema.xml, solrconfig.xml, zoo.cfg
>
>
> I created a SolrCloud infrastructure with 2 shards and 1 leader and 2 
> reaplicas for each shard: Zookeeper is deployed in an external ensemble.
> When I add a new document, or when I delete an existing document, all works 
> correctly.
> But when I update an existent document, the field value is not correctly 
> propagated between the shards, with inconsistency of the index (the query 
> result for that document shows sometimes the new value, sometimes the old 
> value: I see the value because the field is stored).
> Example for the reproduction of the issue:
> - Create document with id "List" and field PATH with value 1 on shard *1*.
> - Query for document (ID:List) -> All OK
> - Create document with id "List" and field PATH with value 2 on shard *2* 
> (document update).
> - Query for document (ID:List) -> Issue: sometimes answers with value 1, 
> sometimes answers with value 2



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)




[jira] [Commented] (SOLR-8589) Add aliases to the LIST action results in the Collections API

2016-01-26 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15117443#comment-15117443
 ] 

Yago Riveiro commented on SOLR-8589:


[~elyograg], as I said before, aliases are related to collections, so a new 
command doesn't make sense. An alias of a collection is a virtual collection, 
and therefore it should be part of the LIST command. We share the same opinion.

> Add aliases to the LIST action results in the Collections API
> -
>
> Key: SOLR-8589
> URL: https://issues.apache.org/jira/browse/SOLR-8589
> Project: Solr
>  Issue Type: Improvement
>  Components: SolrCloud
>Affects Versions: 5.4.1
>Reporter: Shawn Heisey
>Assignee: Shawn Heisey
>Priority: Minor
> Attachments: SOLR-8589.patch, solr-8589-new-list-details-aliases.png
>
>
> Although it is possible to get a list of SolrCloud aliases via an HTTP API, 
> it is not available as a typical query response; I believe it is only 
> available via the HTTP API for ZooKeeper.
> The results from the LIST action in the Collections API is well-situated to 
> handle this. The current results are contained in a "collections" node, we 
> can simply add an "aliases" node if there are any aliases defined.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)




[jira] [Commented] (SOLR-8589) Add aliases to the LIST action results in the Collections API

2016-01-26 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15117356#comment-15117356
 ] 

Yago Riveiro commented on SOLR-8589:


How is the aliases list exposed in SOLR-4968?

> Add aliases to the LIST action results in the Collections API
> -
>
> Key: SOLR-8589
> URL: https://issues.apache.org/jira/browse/SOLR-8589
> Project: Solr
>  Issue Type: Improvement
>  Components: SolrCloud
>Affects Versions: 5.4.1
>Reporter: Shawn Heisey
>Assignee: Shawn Heisey
>Priority: Minor
> Attachments: SOLR-8589.patch, solr-8589-new-list-details-aliases.png
>
>
> Although it is possible to get a list of SolrCloud aliases via an HTTP API, 
> it is not available as a typical query response; I believe it is only 
> available via the HTTP API for ZooKeeper.
> The results from the LIST action in the Collections API is well-situated to 
> handle this. The current results are contained in a "collections" node, we 
> can simply add an "aliases" node if there are any aliases defined.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)




[jira] [Commented] (SOLR-8589) Add aliases to the LIST action results in the Collections API

2016-01-23 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15113817#comment-15113817
 ] 

Yago Riveiro commented on SOLR-8589:


Aliases are related to collections. If I need to do an HTTP call to get the 
aliases from the clusterstatus API, I will need to parse a huge structure (with 
thousands of collections), with a lot of noise, only to know the aliases ...

IMHO the Collections API should return this info when requested on a LIST 
command. Something like aliases=true
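The `aliases=true` proposal above can be sketched from the consumer side. The response shape below is a hypothetical illustration of a LIST result that also carries an "aliases" node (as the attached patch proposes), not the actual Solr API contract.

```python
import json

# Hypothetical LIST response carrying both "collections" and "aliases" nodes.
sample = json.loads("""
{
  "responseHeader": {"status": 0, "QTime": 1},
  "collections": ["logs-2016-01", "logs-2016-02"],
  "aliases": {"logs-current": "logs-2016-02"}
}
""")

def list_with_aliases(resp):
    # Treat each alias as a "virtual collection" alongside the real ones,
    # without having to parse the much larger clusterstatus output.
    collections = list(resp.get("collections", []))
    aliases = resp.get("aliases", {})  # absent when no aliases are defined
    return collections, aliases

cols, aliases = list_with_aliases(sample)
```

A client using this shape avoids a second call to the ZooKeeper HTTP API just to resolve alias names.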

> Add aliases to the LIST action results in the Collections API
> -
>
> Key: SOLR-8589
> URL: https://issues.apache.org/jira/browse/SOLR-8589
> Project: Solr
>  Issue Type: Improvement
>  Components: SolrCloud
>Affects Versions: 5.4.1
>Reporter: Shawn Heisey
>Assignee: Shawn Heisey
>Priority: Minor
> Attachments: SOLR-8589.patch
>
>
> Although it is possible to get a list of SolrCloud aliases vi an HTTP API, it 
> is not available as a typical query response, I believe it is only available 
> via the http API for zookeeper.
> The results from the LIST action in the Collections API is well-situated to 
> handle this. The current results are contained in a "collections" node, we 
> can simply add an "aliases" node if there are any aliases defined.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)




Re: Nested document query with wrong numFound value

2015-12-11 Thread Yago Riveiro
When you say that I have duplicates, what do you mean?

If I have duplicate documents it's not intentional; each document must be unique.


Running a query for each id:

- Parent:  3181426982318142698228
- Child_1: 31814269823181426982280
- Child_2: 31814269823181426982281

The result is one document for each …

responseHeader: {
  status: 0,
  QTime: 3,
  params: {
    q: "id:3181426982318142698228",
    fl: "id",
    q.op: "AND"
  }
},
response: {
  numFound: 1,
  start: 0,
  maxScore: 11.017976,
  docs: [
    { id: "3181426982318142698228" }
  ]
}

responseHeader: {
  status: 0,
  QTime: 3,
  params: {
    q: "id:31814269823181426982280",
    fl: "id",
    q.op: "AND"
  }
},
response: {
  numFound: 1,
  start: 0,
  maxScore: 9.919363,
  docs: [
    { id: "31814269823181426982280" }
  ]
}

responseHeader: {
  status: 0,
  QTime: 3,
  params: {
    q: "id:31814269823181426982281",
    fl: "id",
    q.op: "AND"
  }
},
response: {
  numFound: 1,
  start: 0,
  maxScore: 9.919363,
  docs: [
    { id: "31814269823181426982281" }
  ]
}

—/Yago Riveiro





Ok. I got it. SolrCloud relies on uniqueKey (id) for merging shard results, but 
in your examples it doesn't work, because nested documents disable this. And 
you have duplicates, which make the merge heap mad:

false}
<http://node-01:8983/solr/ecommerce-15_shard1_replica2/,rid=node-01-ecommerce-15_shard1_replica2-1449842438070-0,rows=10,version=2,q=id:3181426982318142698228*,requestPurpose=GET_TOP_IDS,NOW=1449842438070,isShard=true,wt=javabin,debugQuery=false%7D>},response={numFound=11,start=0,maxScore=1.0,docs=[SolrDocument{id=31814269823181426982280, score=1.0}, SolrDocument{id=31814269823181426982280, score=1.0}, SolrDocument{id=31814269823181426982280, score=1.0}, SolrDocument{id=31814269823181426982280, score=1.0}, SolrDocument{id=31814269823181426982280, score=1.0}, SolrDocument{id=31814269823181426982281, score=1.0}, SolrDocument{id=31814269823181426982281, score=1.0}, SolrDocument{id=31814269823181426982281, score=1.0}, SolrDocument{id=31814269823181426982281, score=1.0},

Yago, you encountered a quite curious fact. Congratulations!

You can only retrieve parent documents with SolrCloud, hence use {!parent ..}.. 
or fq=type:parent.

ccing Devs:
Shouldn't it prosecute ID dupes explicitly? Is it a known feature?
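The advice above ("use {!parent ..} or fq=type:parent") can be sketched as query construction. The field name `type` and value `parent` are assumptions; adapt them to the actual schema.

```python
def parent_query(child_clause, parent_filter="type:parent"):
    # Block-join parent query parser: return only parent documents whose
    # children match the clause, so the uniqueKey-based shard merge sees
    # each parent id exactly once. Field/value names are illustrative.
    return {
        "q": "{!parent which='%s'}%s" % (parent_filter, child_clause),
        "fl": "id",
    }

params = parent_query("child_type:ecommerce_product")
```

Querying child ids directly (as in the wildcard example above) mixes parents and children in the merged result, which is what produced the duplicate SolrDocument entries in the debug output.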



On Fri, Dec 11, 2015 at 5:08 PM, Yago Riveiro wrote:

> This:
>
> {
>   responseHeader: {
>     status: 0,
>     QTime: 10,
>     params: {
>       q: "id:3181426982318142698228*",
>       debugQuery: "true"
>     }
>   },
>   response: {
>     numFound: 3,
>     start: 0,
>     maxScore: 1,
>     docs: [{
>       id: "31814269823181426982280",
>       child_type: "ecommerce_product",
>       qty: 1,
>       product_price: 49.99
>     }, {
>       id: "31814269823181426982281",
>       child_type: "ecommerce_product",
>       qty: 1,
>       product_price: 139.9
>     }]
>   },
>   debug: {
>     track: {
>       rid: "node-01-ecommerce-15_shard1_replica2-1449842438070-0",
>       EXECUTE_QUERY: {
>         http://node-17:8983/solr/ecommerce-15_shard2_replica1/: {
>           QTime: "0",
>           ElapsedTime: "2",
>           RequestPurpose: "GET_TOP_IDS",
>           NumFound: "0",
>           Response: "{responseHeader={status=0,QTime=0,params={df=_text_,distrib=false,debug=[false, timing, track],qt=/query,fl=[id, score],shards.purpose=4,start=0,fsv=true,shard.url=http://node-17:8983/solr/ecommerce-15_shard2_replica1/,rid=node-01-ecomm
[jira] [Created] (SOLR-8257) DELETEREPLICA command shouldn't delete the last replica of a shard

2015-11-09 Thread Yago Riveiro (JIRA)
Yago Riveiro created SOLR-8257:
--

 Summary: DELETEREPLICA command shouldn't delete the last replica of 
a shard
 Key: SOLR-8257
 URL: https://issues.apache.org/jira/browse/SOLR-8257
 Project: Solr
  Issue Type: Bug
Reporter: Yago Riveiro
Priority: Minor


The DELETEREPLICA command shouldn't remove the last replica of a shard.

The original thread in the mailing list 
http://lucene.472066.n3.nabble.com/DELETEREPLICA-command-shouldn-t-delete-de-last-replica-of-a-shard-td4239054.html
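The guard this issue asks for can be sketched as a pre-check before issuing the DELETEREPLICA call. The cluster-state dict shape below is a simplified assumption, not Solr's actual clusterstate.json schema.

```python
def can_delete_replica(cluster_state, collection, shard, replica):
    # Refuse to delete a replica when it is the last one in its shard:
    # deleting the only replica would destroy the shard's data entirely.
    replicas = cluster_state[collection]["shards"][shard]["replicas"]
    if replica not in replicas:
        raise KeyError("unknown replica: %s" % replica)
    return len(replicas) > 1

# Simplified cluster state with a single replica in shard1.
state = {"logs": {"shards": {"shard1": {"replicas": {"core_node1": {}}}}}}
```

With this check in front of the Collections API call, an accidental DELETEREPLICA on a single-replica shard is rejected instead of silently removing the last copy.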



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)




[jira] [Commented] (SOLR-6583) Resuming connection with ZooKeeper causes log replay

2014-12-02 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14231304#comment-14231304
 ] 

Yago Riveiro commented on SOLR-6583:


This is happening in Solr 4.6.1 too.

{code}
ERROR - app2 - 2014-12-01 21:30:42.820; org.apache.solr.update.UpdateLog; Error 
inspecting tlog 
tlog{file=/solr/node/collections/collection1_shard2_replica1/data/tlog/tlog.0001284
 refcount=2}
{code}

> Resuming connection with ZooKeeper causes log replay
> 
>
> Key: SOLR-6583
> URL: https://issues.apache.org/jira/browse/SOLR-6583
> Project: Solr
>  Issue Type: Bug
>  Components: SolrCloud
>Affects Versions: 4.10.1
>Reporter: Shalin Shekhar Mangar
>Assignee: Shalin Shekhar Mangar
>Priority: Minor
> Fix For: 5.0, Trunk
>
>
> If a node is partitioned from ZooKeeper for an extended period of time then 
> upon resuming connection, the node re-registers itself causing 
> recoverFromLog() method to be executed which fails with the following 
> exception:
> {code}
> 8091124 [Thread-71] ERROR org.apache.solr.update.UpdateLog  – Error 
> inspecting tlog 
> tlog{file=/home/ubuntu/shalin-lusolr/solr/example/solr/collection_5x3_shard5_replica3/data/tlog/tlog.0009869
>  refcount=2}
> java.nio.channels.ClosedChannelException
> at sun.nio.ch.FileChannelImpl.ensureOpen(FileChannelImpl.java:99)
> at sun.nio.ch.FileChannelImpl.read(FileChannelImpl.java:678)
> at 
> org.apache.solr.update.ChannelFastInputStream.readWrappedStream(TransactionLog.java:784)
> at 
> org.apache.solr.common.util.FastInputStream.refill(FastInputStream.java:89)
> at 
> org.apache.solr.common.util.FastInputStream.read(FastInputStream.java:125)
> at java.io.InputStream.read(InputStream.java:101)
> at 
> org.apache.solr.update.TransactionLog.endsWithCommit(TransactionLog.java:218)
> at org.apache.solr.update.UpdateLog.recoverFromLog(UpdateLog.java:800)
> at org.apache.solr.cloud.ZkController.register(ZkController.java:834)
> at org.apache.solr.cloud.ZkController$1.command(ZkController.java:271)
> at 
> org.apache.solr.common.cloud.ConnectionManager$1$1.run(ConnectionManager.java:166)
> 8091125 [Thread-71] ERROR org.apache.solr.update.UpdateLog  – Error 
> inspecting tlog 
> tlog{file=/home/ubuntu/shalin-lusolr/solr/example/solr/collection_5x3_shard5_replica3/data/tlog/tlog.0009870
>  refcount=2}
> java.nio.channels.ClosedChannelException
> at sun.nio.ch.FileChannelImpl.ensureOpen(FileChannelImpl.java:99)
> at sun.nio.ch.FileChannelImpl.read(FileChannelImpl.java:678)
> at 
> org.apache.solr.update.ChannelFastInputStream.readWrappedStream(TransactionLog.java:784)
> at 
> org.apache.solr.common.util.FastInputStream.refill(FastInputStream.java:89)
> at 
> org.apache.solr.common.util.FastInputStream.read(FastInputStream.java:125)
> at java.io.InputStream.read(InputStream.java:101)
> at 
> org.apache.solr.update.TransactionLog.endsWithCommit(TransactionLog.java:218)
> at org.apache.solr.update.UpdateLog.recoverFromLog(UpdateLog.java:800)
> at org.apache.solr.cloud.ZkController.register(ZkController.java:834)
> at org.apache.solr.cloud.ZkController$1.command(ZkController.java:271)
> at 
> org.apache.solr.common.cloud.ConnectionManager$1$1.run(ConnectionManager.java:166)
> {code}
> This is because the recoverFromLog uses transaction log references that were 
> collected at startup and are no longer valid.
> We shouldn't even be running recoverFromLog code for ZK re-connect.
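The root cause described above (replaying transaction-log references collected at startup after the underlying channels have been closed) can be illustrated in miniature. The class below is a hypothetical analogue, not Solr's UpdateLog API: it re-resolves the log files at replay time instead of caching open handles.

```python
import os
import tempfile

class TlogReplayer:
    def __init__(self, tlog_dir):
        # Store only the directory path; re-listing and re-opening the logs
        # at replay time avoids the stale-handle failure mode (the
        # ClosedChannelException in the stack traces above).
        self.tlog_dir = tlog_dir

    def replay(self):
        entries = []
        for name in sorted(os.listdir(self.tlog_dir)):
            with open(os.path.join(self.tlog_dir, name)) as f:
                entries.extend(f.read().splitlines())
        return entries

# Demo: two tlog files replayed by re-opening them on demand.
tmp = tempfile.mkdtemp()
for name, text in [("tlog.001", "add doc1\n"), ("tlog.002", "add doc2\n")]:
    with open(os.path.join(tmp, name), "w") as f:
        f.write(text)
entries = TlogReplayer(tmp).replay()
```

The contrast with the reported bug: a replayer that kept file objects opened at startup would fail once those handles were invalidated, exactly the scenario triggered by the ZooKeeper re-connect path.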



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)




[jira] [Commented] (SOLR-5473) Make one state.json per collection

2014-06-26 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14044678#comment-14044678
 ] 

Yago Riveiro commented on SOLR-5473:


[~elyograg] I may be wrong, but I think that the 1MB limit is per znode and not 
for the ZK database.



> Make one state.json per collection
> --
>
> Key: SOLR-5473
> URL: https://issues.apache.org/jira/browse/SOLR-5473
> Project: Solr
>  Issue Type: Sub-task
>  Components: SolrCloud
>Reporter: Noble Paul
>Assignee: Noble Paul
> Fix For: 5.0
>
> Attachments: SOLR-5473-74 .patch, SOLR-5473-74.patch, 
> SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
> SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
> SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
> SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
> SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
> SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
> SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
> SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
> SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
> SOLR-5473-configname-fix.patch, SOLR-5473.patch, SOLR-5473.patch, 
> SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, 
> SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, 
> SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, 
> SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, 
> SOLR-5473_undo.patch, ec2-23-20-119-52_solr.log, ec2-50-16-38-73_solr.log
>
>
> As defined in the parent issue, store the states of each collection under 
> /collections/collectionname/state.json node



--
This message was sent by Atlassian JIRA
(v6.2#6252)




[jira] [Commented] (SOLR-4793) Solr Cloud can't upload large config files ( > 1MB) to Zookeeper

2014-06-20 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14039198#comment-14039198
 ] 

Yago Riveiro commented on SOLR-4793:


it's probably because I tweaked the zkServer file a bit ... :P

> Solr Cloud can't upload large config files ( > 1MB)  to Zookeeper
> -
>
> Key: SOLR-4793
> URL: https://issues.apache.org/jira/browse/SOLR-4793
> Project: Solr
>  Issue Type: Improvement
>Reporter: Son Nguyen
>
> Zookeeper set znode size limit to 1MB by default. So we can't start Solr 
> Cloud with some large config files, like synonyms.txt.
> Jan Høydahl has a good idea:
> "SolrCloud is designed with an assumption that you should be able to upload 
> your whole disk-based conf folder into ZK, and that you should be able to add 
> an empty Solr node to a cluster and it would download all config from ZK. So 
> immediately a splitting strategy automatically handled by ZkSolresourceLoader 
> for large files could be one way forward, i.e. store synonyms.txt as e.g. 
> __001_synonyms.txt __002_synonyms.txt"



--
This message was sent by Atlassian JIRA
(v6.2#6252)




[jira] [Commented] (SOLR-4793) Solr Cloud can't upload large config files ( > 1MB) to Zookeeper

2014-06-20 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14038970#comment-14038970
 ] 

Yago Riveiro commented on SOLR-4793:


Elaine, now it's easier to do the debugging, you know where the "problem" is :).

Note: I'm using version 3.4.5 of ZooKeeper, I don't know if zkServer.sh 
has changed.

> Solr Cloud can't upload large config files ( > 1MB)  to Zookeeper
> -
>
> Key: SOLR-4793
> URL: https://issues.apache.org/jira/browse/SOLR-4793
> Project: Solr
>  Issue Type: Improvement
>Reporter: Son Nguyen
>
> Zookeeper set znode size limit to 1MB by default. So we can't start Solr 
> Cloud with some large config files, like synonyms.txt.
> Jan Høydahl has a good idea:
> "SolrCloud is designed with an assumption that you should be able to upload 
> your whole disk-based conf folder into ZK, and that you should be able to add 
> an empty Solr node to a cluster and it would download all config from ZK. So 
> immediately a splitting strategy automatically handled by ZkSolresourceLoader 
> for large files could be one way forward, i.e. store synonyms.txt as e.g. 
> __001_synonyms.txt __002_synonyms.txt"



--
This message was sent by Atlassian JIRA
(v6.2#6252)




[jira] [Commented] (SOLR-4793) Solr Cloud can't upload large config files ( > 1MB) to Zookeeper

2014-06-20 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14038918#comment-14038918
 ] 

Yago Riveiro commented on SOLR-4793:


Indeed, after diving into the zkEnv file, I realised that if zookeeper-env.sh 
exists, ZooKeeper appends the configurations to the init command.

> Solr Cloud can't upload large config files ( > 1MB)  to Zookeeper
> -
>
> Key: SOLR-4793
> URL: https://issues.apache.org/jira/browse/SOLR-4793
> Project: Solr
>  Issue Type: Improvement
>Reporter: Son Nguyen
>
> Zookeeper set znode size limit to 1MB by default. So we can't start Solr 
> Cloud with some large config files, like synonyms.txt.
> Jan Høydahl has a good idea:
> "SolrCloud is designed with an assumption that you should be able to upload 
> your whole disk-based conf folder into ZK, and that you should be able to add 
> an empty Solr node to a cluster and it would download all config from ZK. So 
> immediately a splitting strategy automatically handled by ZkSolresourceLoader 
> for large files could be one way forward, i.e. store synonyms.txt as e.g. 
> __001_synonyms.txt __002_synonyms.txt"



--
This message was sent by Atlassian JIRA
(v6.2#6252)




[jira] [Commented] (SOLR-4793) Solr Cloud can't upload large config files ( > 1MB) to Zookeeper

2014-06-20 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14038898#comment-14038898
 ] 

Yago Riveiro commented on SOLR-4793:


About Tomcat's configuration, I have the same configuration.

In the case of ZooKeeper, I have all custom configurations in a file named 
zookeeper-env.sh located in the bin/conf folder, with this content:

{code}
#!/usr/bin/env bash

ZOO_ENV="-Djute.maxbuffer=5000"
{code}
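Jan Høydahl's splitting idea quoted in this issue (store an over-limit config file as numbered chunks like __001_synonyms.txt, __002_synonyms.txt) can be sketched as follows. The chunk naming mirrors the quote; the helper itself is hypothetical, not part of ZkSolrResourceLoader.

```python
def split_config(data: bytes, name: str, chunk_size: int = 1024 * 1024):
    # Split a config file into znode-sized chunks so each part stays under
    # ZooKeeper's default 1MB jute.maxbuffer limit. Naming follows the
    # __NNN_<name> convention from the quoted proposal.
    chunks = {}
    for i in range(0, len(data), chunk_size):
        part_no = i // chunk_size + 1
        chunks["__%03d_%s" % (part_no, name)] = data[i:i + chunk_size]
    return chunks

# Example: a 2MB+10B synonyms file splits into three chunks.
parts = split_config(b"a" * (2 * 1024 * 1024 + 10), "synonyms.txt")
```

A loader on the download side would concatenate the chunks in `__NNN` order to reconstruct synonyms.txt, making the splitting transparent to Solr nodes joining the cluster.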

> Solr Cloud can't upload large config files ( > 1MB)  to Zookeeper
> -
>
> Key: SOLR-4793
> URL: https://issues.apache.org/jira/browse/SOLR-4793
> Project: Solr
>  Issue Type: Improvement
>Reporter: Son Nguyen
>
> Zookeeper set znode size limit to 1MB by default. So we can't start Solr 
> Cloud with some large config files, like synonyms.txt.
> Jan Høydahl has a good idea:
> "SolrCloud is designed with an assumption that you should be able to upload 
> your whole disk-based conf folder into ZK, and that you should be able to add 
> an empty Solr node to a cluster and it would download all config from ZK. So 
> immediately a splitting strategy automatically handled by ZkSolresourceLoader 
> for large files could be one way forward, i.e. store synonyms.txt as e.g. 
> __001_synonyms.txt __002_synonyms.txt"



--
This message was sent by Atlassian JIRA
(v6.2#6252)




[jira] [Commented] (SOLR-4793) Solr Cloud can't upload large config files ( > 1MB) to Zookeeper

2014-06-20 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14038874#comment-14038874
 ] 

Yago Riveiro commented on SOLR-4793:


Elaine can you paste the configuration for tomcat and zookeeper that you have 
for the jute.maxbuffer?

> Solr Cloud can't upload large config files ( > 1MB)  to Zookeeper
> -
>
> Key: SOLR-4793
> URL: https://issues.apache.org/jira/browse/SOLR-4793
> Project: Solr
>  Issue Type: Improvement
>Reporter: Son Nguyen
>
> Zookeeper set znode size limit to 1MB by default. So we can't start Solr 
> Cloud with some large config files, like synonyms.txt.
> Jan Høydahl has a good idea:
> "SolrCloud is designed with an assumption that you should be able to upload 
> your whole disk-based conf folder into ZK, and that you should be able to add 
> an empty Solr node to a cluster and it would download all config from ZK. So 
> immediately a splitting strategy automatically handled by ZkSolresourceLoader 
> for large files could be one way forward, i.e. store synonyms.txt as e.g. 
> __001_synonyms.txt __002_synonyms.txt"



--
This message was sent by Atlassian JIRA
(v6.2#6252)




[jira] [Commented] (SOLR-4793) Solr Cloud can't upload large config files ( > 1MB) to Zookeeper

2014-06-20 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14038810#comment-14038810
 ] 

Yago Riveiro commented on SOLR-4793:


I think that version 4.8 updates ZooKeeper to version 3.4.6.

If the workaround doesn't work, then it is a serious issue if you have a large 
number of collections and replicas, because all metadata about the cluster is 
in the clusterstate.json file.

[~ecario], how did you notice that the workaround doesn't work? Do you have any 
logs or something? And one last question: did you upgrade Solr from 4.7 to 4.8, 
or is it a fresh install?




> Solr Cloud can't upload large config files ( > 1MB)  to Zookeeper
> -
>
> Key: SOLR-4793
> URL: https://issues.apache.org/jira/browse/SOLR-4793
> Project: Solr
>  Issue Type: Improvement
>Reporter: Son Nguyen
>
> Zookeeper set znode size limit to 1MB by default. So we can't start Solr 
> Cloud with some large config files, like synonyms.txt.
> Jan Høydahl has a good idea:
> "SolrCloud is designed with an assumption that you should be able to upload 
> your whole disk-based conf folder into ZK, and that you should be able to add 
> an empty Solr node to a cluster and it would download all config from ZK. So 
> immediately a splitting strategy automatically handled by ZkSolresourceLoader 
> for large files could be one way forward, i.e. store synonyms.txt as e.g. 
> __001_synonyms.txt __002_synonyms.txt"



--
This message was sent by Atlassian JIRA
(v6.2#6252)




[jira] [Created] (SOLR-5788) Document update in case of error doesn't returns the error message correctly

2014-02-27 Thread Yago Riveiro (JIRA)
Yago Riveiro created SOLR-5788:
--

 Summary: Document update in case of error doesn't returns the 
error message correctly
 Key: SOLR-5788
 URL: https://issues.apache.org/jira/browse/SOLR-5788
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.6.1
Reporter: Yago Riveiro


I found an issue when updating a document.

If for any reason the update can't be done (for example, the schema doesn't 
match the incoming doc), the error raised to the user is something like:

{noformat}
curl 'http://localhost:8983/solr/collection1/update?commit=true' --data-binary 
@doc.json -H 'Content-type:application/json'
{"responseHeader":{"status":400,"QTime":52},"error":{"msg":"Bad 
Request\n\n\n\nrequest: 
http://localhost:8983/solr/collection1_shard3_replica1/update?update.distrib=TOLEADER&distrib.from=http%3A%2F%2Flocalhost%3A8983%2Fsolr%2Fcollection1_shard1_replica2%2F&wt=javabin&version=2","code":400}}
{noformat}

In case the update was done on the leader, the error message is (IMHO) correct 
and carries valuable info:

{noformat}
curl 'http://localhost:8983/solr/collection1/update?commit=true' --data-binary 
@doc.json -H 'Content-type:application/json'
{"responseHeader":{"status":400,"QTime":19},"error":{"msg":"ERROR: 
[doc=01!12967564] Error adding field 'source'='[Direct]' msg=For input string: 
\"Direct\","code":400}}
{noformat}
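The behavior this report asks for can be sketched from the client side: when a forwarded update fails with a bare "Bad Request", fall back to a generic message, but surface the leader's detailed per-document error whenever it is present. The response strings are modeled on the report; the helper is hypothetical.

```python
import json

def extract_error(response_body: str) -> str:
    # Pull the error node out of a Solr JSON error response and prefer the
    # detailed per-document message (the leader's form) when present.
    err = json.loads(response_body).get("error", {})
    msg = err.get("msg", "")
    if msg.startswith("ERROR: [doc="):
        return msg
    return "update failed (code %s): %s" % (err.get("code"), msg or "unknown")

# Leader-style response, abbreviated from the report above.
leader = ('{"responseHeader":{"status":400,"QTime":19},"error":'
          '{"msg":"ERROR: [doc=01!12967564] Error adding field","code":400}}')
```

Until the server propagates the leader's message through forwarding nodes, a client-side check like this at least distinguishes the two error shapes.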



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)




[jira] [Updated] (SOLR-5788) Document update in case of error doesn't return the error message correctly

2014-02-27 Thread Yago Riveiro (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yago Riveiro updated SOLR-5788:
---

Summary: Document update in case of error doesn't return the error message 
correctly  (was: Document update in case of error doesn't returns the error 
message correctly)

> Document update in case of error doesn't return the error message correctly
> ---
>
> Key: SOLR-5788
> URL: https://issues.apache.org/jira/browse/SOLR-5788
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 4.6.1
>Reporter: Yago Riveiro
>
> I found an issue when updating a document.
> If for any reason the update can't be done (for example, the schema doesn't 
> match the incoming doc), the error raised to the user is something like:
> {noformat}
> curl 'http://localhost:8983/solr/collection1/update?commit=true' 
> --data-binary @doc.json -H 'Content-type:application/json'
> {"responseHeader":{"status":400,"QTime":52},"error":{"msg":"Bad 
> Request\n\n\n\nrequest: 
> http://localhost:8983/solr/collection1_shard3_replica1/update?update.distrib=TOLEADER&distrib.from=http%3A%2F%2Flocalhost%3A8983%2Fsolr%2Fcollection1_shard1_replica2%2F&wt=javabin&version=2","code":400}}
> {noformat}
> In case the update was done on the leader, the error message is (IMHO) 
> correct and carries valuable info:
> {noformat}
> curl 'http://localhost:8983/solr/collection1/update?commit=true' 
> --data-binary @doc.json -H 'Content-type:application/json'
> {"responseHeader":{"status":400,"QTime":19},"error":{"msg":"ERROR: 
> [doc=01!12967564] Error adding field 'source'='[Direct]' msg=For input 
> string: \"Direct\","code":400}}
> {noformat}






[jira] [Updated] (SOLR-5732) NPE trying get stats with statsComponent

2014-02-14 Thread Yago Riveiro (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5732?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yago Riveiro updated SOLR-5732:
---

Description: 
I'm trying to get stats over a field with type solr.TrieDateField

The field is configured as:
{noformat}
 
{noformat}

Trying to run this query:
{noformat}
q=datetime:[2014-01-01T00:00:00Z%20TO%202014-01-01T00:10:00Z]&stats=true&stats.field=datetime
{noformat}

I have this exception: http://apaste.info/dWL0

A screenshot of the field with its flags is 
[here|https://www.dropbox.com/s/6suvoipwuunvk25/Screenshot%202014-02-14%2018.07.59.png]

I can run a facet search over the field without any problem.

  was:
I'm trying to get stats over a field with type solr.TrieDateField

The field is configurated as:
{noformat}
 
{noformat}

Triying to run this query: 
{noformat}
q=datetime:[2014-01-01T00:00:00Z%20TO%202014-01-01T00:10:00Z]&stats=true&stats.field=datetime
{noformat}

I have this exception: http://apaste.info/dWL0

A printscreen of the field with the flags  
[here|https://www.dropbox.com/s/6suvoipwuunvk25/Screenshot%202014-02-14%2018.07.59.png]


> NPE trying get stats with statsComponent
> 
>
> Key: SOLR-5732
> URL: https://issues.apache.org/jira/browse/SOLR-5732
> Project: Solr
>  Issue Type: Bug
>    Affects Versions: 4.6.1
>Reporter: Yago Riveiro
>
> I'm trying to get stats over a field with type solr.TrieDateField
> The field is configured as:
> {noformat}
>   positionIncrementGap="0" sortMissingLast="true" omitNorms="true" 
> omitPositions="true" docValuesFormat="Disk"/>
> {noformat}
> Trying to run this query:
> {noformat}
> q=datetime:[2014-01-01T00:00:00Z%20TO%202014-01-01T00:10:00Z]&stats=true&stats.field=datetime
> {noformat}
> I have this exception: http://apaste.info/dWL0
> A screenshot of the field with its flags is 
> [here|https://www.dropbox.com/s/6suvoipwuunvk25/Screenshot%202014-02-14%2018.07.59.png]
> I can run a facet search over the field without any problem.
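The stats request quoted above can be built programmatically, which also avoids hand-encoding the date-range characters. A sketch using only the standard library (host, core name, and field name are taken from the report; nothing here requires a running Solr):

```python
from urllib.parse import urlencode

def stats_url(base: str, field: str, start: str, end: str) -> str:
    """Build a Solr stats-over-a-date-range query URL."""
    params = {
        "q": f"{field}:[{start} TO {end}]",  # range query on the date field
        "stats": "true",
        "stats.field": field,
    }
    return f"{base}/select?{urlencode(params)}"

url = stats_url("http://localhost:8983/solr/collection1", "datetime",
                "2014-01-01T00:00:00Z", "2014-01-01T00:10:00Z")
print(url)
```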






[jira] [Created] (SOLR-5732) NPE trying get stats with statsComponent

2014-02-14 Thread Yago Riveiro (JIRA)
Yago Riveiro created SOLR-5732:
--

 Summary: NPE trying get stats with statsComponent
 Key: SOLR-5732
 URL: https://issues.apache.org/jira/browse/SOLR-5732
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.6.1
Reporter: Yago Riveiro


I'm trying to get stats over a field with type solr.TrieDateField

The field is configured as:
{noformat}
 
{noformat}

Trying to run this query:
{noformat}
q=datetime:[2014-01-01T00:00:00Z%20TO%202014-01-01T00:10:00Z]&stats=true&stats.field=datetime
{noformat}

I have this exception: http://apaste.info/dWL0

A screenshot of the field with its flags is 
[here|https://www.dropbox.com/s/6suvoipwuunvk25/Screenshot%202014-02-14%2018.07.59.png]






[jira] [Commented] (SOLR-5724) Two node, one shard solr instance intermittently going offline

2014-02-13 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13900513#comment-13900513
 ] 

Yago Riveiro commented on SOLR-5724:


I have this issue too; the only way I found to recover from it was to restart 
the nodes.

> Two node, one shard solr instance intermittently going offline 
> ---
>
> Key: SOLR-5724
> URL: https://issues.apache.org/jira/browse/SOLR-5724
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 4.6.1
> Environment: Ubuntu 12.04.3 LTS, 64 bit,  java version "1.6.0_45"
> Java(TM) SE Runtime Environment (build 1.6.0_45-b06)
> Java HotSpot(TM) 64-Bit Server VM (build 20.45-b01, mixed mode)
>Reporter: Joseph Duchesne
>
> One server is stuck in state "recovering" while the other is stuck in state 
> "down". After waiting 45 minutes or so for the cluster to recover, the 
> statuses were the same. 
> Log messages on the "recovering" server: (Just the individual errors for 
> brevity, I can provide full stack traces if that is helpful)
> {quote}
> We are not the leader
> ClusterState says we are the leader, but locally we don't think so
> cancelElection did not find election node to remove
> We are not the leader
> No registered leader was found, collection:listsC slice:shard1
> No registered leader was found, collection:listsC slice:shard1
> {quote}
> On the "down" server at the same timeframe:
> {quote}
> org.apache.solr.common.SolrException; forwarding update to 
> http://10.0.2.48:8983/solr/listsC/ failed - retrying ... retries: 3
> org.apache.solr.update.StreamingSolrServers$1; error
> Error while trying to recover. 
> core=listsC:org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException:
>  We are not the leader
> Recovery failed - trying again... (0) core=listsC
> Stopping recovery for zkNodeName=core_node2core=listsC
> org.apache.solr.update.StreamingSolrServers$1; error
> org.apache.solr.common.SolrException: Service Unavailable
> {quote}
> I am not sure what is causing this, however it has happened a 3 times in the 
> past week. If there are any additional logs I can provide, or if there is 
> anything I can do to try to figure this out myself I will gladly try to help. 






[jira] [Commented] (SOLR-5670) _version_ either indexed OR docvalue

2014-01-28 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13884076#comment-13884076
 ] 

Yago Riveiro commented on SOLR-5670:


The Solr guide are here [Solr Reference 
Guide|https://cwiki.apache.org/confluence/display/solr/Apache+Solr+Reference+Guide]

> _version_ either indexed OR docvalue
> 
>
> Key: SOLR-5670
> URL: https://issues.apache.org/jira/browse/SOLR-5670
> Project: Solr
>  Issue Type: Improvement
>  Components: SolrCloud
>Affects Versions: 4.7
>Reporter: Per Steffensen
>Assignee: Per Steffensen
>  Labels: solr, solrcloud, version
> Fix For: 5.0, 4.7
>
> Attachments: SOLR-5670.patch, SOLR-5670.patch
>
>
> As far as I can see there is no good reason to require that "_version_" field 
> has to be indexed if it is docvalued. So I guess it will be ok with a rule 
> saying "_version_ has to be either indexed or docvalue (allowed to be both)".
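Under the rule proposed above, a schema could declare _version_ as docValues-only. A hedged sketch of such an entry (the `long` field type follows the stock example schema; whether `indexed="false"` is actually accepted for _version_ depends on this patch being applied):

```xml
<!-- _version_ kept only as docValues: not indexed, not stored as a regular field -->
<field name="_version_" type="long" indexed="false" stored="false" docValues="true"/>
```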






[jira] [Comment Edited] (SOLR-5670) _version_ either indexed OR docvalue

2014-01-28 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13884076#comment-13884076
 ] 

Yago Riveiro edited comment on SOLR-5670 at 1/28/14 12:45 PM:
--

The Solr guide is here [Solr Reference 
Guide|https://cwiki.apache.org/confluence/display/solr/Apache+Solr+Reference+Guide]


was (Author: yriveiro):
The Solr guide are here [Solr Reference 
Guide|https://cwiki.apache.org/confluence/display/solr/Apache+Solr+Reference+Guide]

> _version_ either indexed OR docvalue
> 
>
> Key: SOLR-5670
> URL: https://issues.apache.org/jira/browse/SOLR-5670
> Project: Solr
>  Issue Type: Improvement
>  Components: SolrCloud
>Affects Versions: 4.7
>Reporter: Per Steffensen
>Assignee: Per Steffensen
>  Labels: solr, solrcloud, version
> Fix For: 5.0, 4.7
>
> Attachments: SOLR-5670.patch, SOLR-5670.patch
>
>
> As far as I can see there is no good reason to require that "_version_" field 
> has to be indexed if it is docvalued. So I guess it will be ok with a rule 
> saying "_version_ has to be either indexed or docvalue (allowed to be both)".






[jira] [Commented] (SOLR-5670) _version_ either indexed OR docvalue

2014-01-28 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13883958#comment-13883958
 ] 

Yago Riveiro commented on SOLR-5670:


Should the wiki ref be updated with this info?

This is a minor change, but when we are creating the schema, if we want to 
leverage the docValues feature, this kind of configuration can matter.

> _version_ either indexed OR docvalue
> 
>
> Key: SOLR-5670
> URL: https://issues.apache.org/jira/browse/SOLR-5670
> Project: Solr
>  Issue Type: Improvement
>  Components: SolrCloud
>Affects Versions: 4.7
>Reporter: Per Steffensen
>Assignee: Per Steffensen
>  Labels: solr, solrcloud, version
> Fix For: 5.0, 4.7
>
> Attachments: SOLR-5670.patch, SOLR-5670.patch
>
>
> As far as I can see there is no good reason to require that "_version_" field 
> has to be indexed if it is docvalued. So I guess it will be ok with a rule 
> saying "_version_ has to be either indexed or docvalue (allowed to be both)".






[jira] [Comment Edited] (SOLR-5507) Admin UI - Refactoring using AngularJS

2014-01-04 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13862393#comment-13862393
 ] 

Yago Riveiro edited comment on SOLR-5507 at 1/4/14 7:39 PM:


+1 for use bootstrap.

With an UI tool library with component to use "as is" and a plugin system,  we 
will see a lot of new stuff inside the UI and this is good for the community.




was (Author: yriveiro):
+1 for use bootstrap.

With a UI tool library with component to use "as is" and a plugin system,  we 
will see a lot of new stuff inside the UI and this is good for the community.



> Admin UI - Refactoring using AngularJS
> --
>
> Key: SOLR-5507
> URL: https://issues.apache.org/jira/browse/SOLR-5507
> Project: Solr
>  Issue Type: Improvement
>  Components: web gui
>Reporter: Stefan Matheis (steffkes)
>Assignee: Stefan Matheis (steffkes)
>Priority: Minor
>
> On the LSR in Dublin, i've talked again to [~upayavira] and this time we 
> talked about Refactoring the existing UI - using AngularJS: providing (more, 
> internal) structure and what not ;>
> He already started working on the Refactoring, so this is more a 'tracking' 
> issue about the progress he/we do there.
> Will extend this issue with a bit more context & additional information, w/ 
> thoughts about the possible integration in the existing UI and more (:






[jira] [Comment Edited] (SOLR-5507) Admin UI - Refactoring using AngularJS

2014-01-04 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13862393#comment-13862393
 ] 

Yago Riveiro edited comment on SOLR-5507 at 1/4/14 7:39 PM:


+1 for use bootstrap.

With an UI tool library with components to use "as is" and a plugin system,  we 
will see a lot of new stuff inside the UI and this is good for the community.




was (Author: yriveiro):
+1 for use bootstrap.

With an UI tool library with component to use "as is" and a plugin system,  we 
will see a lot of new stuff inside the UI and this is good for the community.



> Admin UI - Refactoring using AngularJS
> --
>
> Key: SOLR-5507
> URL: https://issues.apache.org/jira/browse/SOLR-5507
> Project: Solr
>  Issue Type: Improvement
>  Components: web gui
>Reporter: Stefan Matheis (steffkes)
>Assignee: Stefan Matheis (steffkes)
>Priority: Minor
>
> On the LSR in Dublin, i've talked again to [~upayavira] and this time we 
> talked about Refactoring the existing UI - using AngularJS: providing (more, 
> internal) structure and what not ;>
> He already started working on the Refactoring, so this is more a 'tracking' 
> issue about the progress he/we do there.
> Will extend this issue with a bit more context & additional information, w/ 
> thoughts about the possible integration in the existing UI and more (:






[jira] [Commented] (SOLR-5507) Admin UI - Refactoring using AngularJS

2014-01-04 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13862393#comment-13862393
 ] 

Yago Riveiro commented on SOLR-5507:


+1 for use bootstrap.

With a UI tool library with component to use "as is" and a plugin system,  we 
will see a lot of new stuff inside the UI and this is good for the community.



> Admin UI - Refactoring using AngularJS
> --
>
> Key: SOLR-5507
> URL: https://issues.apache.org/jira/browse/SOLR-5507
> Project: Solr
>  Issue Type: Improvement
>  Components: web gui
>Reporter: Stefan Matheis (steffkes)
>Assignee: Stefan Matheis (steffkes)
>Priority: Minor
>
> On the LSR in Dublin, i've talked again to [~upayavira] and this time we 
> talked about Refactoring the existing UI - using AngularJS: providing (more, 
> internal) structure and what not ;>
> He already started working on the Refactoring, so this is more a 'tracking' 
> issue about the progress he/we do there.
> Will extend this issue with a bit more context & additional information, w/ 
> thoughts about the possible integration in the existing UI and more (:






[jira] [Commented] (SOLR-5507) Admin UI - Refactoring using AngularJS

2013-12-31 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13859465#comment-13859465
 ] 

Yago Riveiro commented on SOLR-5507:


Ok, seems a valid argument :D.

If you release de code and some guide line about the architecture of the new 
UI, we can work in this new feature and see it in Solr soon.

> Admin UI - Refactoring using AngularJS
> --
>
> Key: SOLR-5507
> URL: https://issues.apache.org/jira/browse/SOLR-5507
> Project: Solr
>  Issue Type: Improvement
>  Components: web gui
>Reporter: Stefan Matheis (steffkes)
>Assignee: Stefan Matheis (steffkes)
>Priority: Minor
>
> On the LSR in Dublin, i've talked again to [~upayavira] and this time we 
> talked about Refactoring the existing UI - using AngularJS: providing (more, 
> internal) structure and what not ;>
> He already started working on the Refactoring, so this is more a 'tracking' 
> issue about the progress he/we do there.
> Will extend this issue with a bit more context & additional information, w/ 
> thoughts about the possible integration in the existing UI and more (:






[jira] [Commented] (SOLR-5507) Admin UI - Refactoring using AngularJS

2013-12-31 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13859451#comment-13859451
 ] 

Yago Riveiro commented on SOLR-5507:


[~upayavira], What are the reasons for keeping the two UIs working together?

I understand that rewriting the whole UI is an epic task, but the time we will 
spend thinking about and implementing a way to have the new and the old UI 
working together could be used to finish the new one and ship it with a new 
release of Solr.

Also, during this transition we will (most probably) generate new bugs and 
artifacts. With a single point in time where we switch between the two, all 
bugs will be about the new UI.


> Admin UI - Refactoring using AngularJS
> --
>
> Key: SOLR-5507
> URL: https://issues.apache.org/jira/browse/SOLR-5507
> Project: Solr
>  Issue Type: Improvement
>  Components: web gui
>Reporter: Stefan Matheis (steffkes)
>Assignee: Stefan Matheis (steffkes)
>Priority: Minor
>
> On the LSR in Dublin, i've talked again to [~upayavira] and this time we 
> talked about Refactoring the existing UI - using AngularJS: providing (more, 
> internal) structure and what not ;>
> He already started working on the Refactoring, so this is more a 'tracking' 
> issue about the progress he/we do there.
> Will extend this issue with a bit more context & additional information, w/ 
> thoughts about the possible integration in the existing UI and more (:






[jira] [Updated] (SOLR-5559) DELETE collection command doesn't works in some cases

2013-12-19 Thread Yago Riveiro (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yago Riveiro updated SOLR-5559:
---

Description: 
I think that I found a bug in the DELETE collection API command.

Environment:
  - N boxes, the number is not important.
  - A collection with N shards spread over the N boxes.
  - Solr.xml old style.
  
I ran the command as 
http://localhost:8983/solr/admin/collections?action=DELETE&name=CollectionX

The command returned a 200, everything was cleaned, and in theory the 
collection was removed ... but for some reason one of the boxes didn't delete 
the references to CollectionX from its solr.xml, and the folders of the cores 
still exist. The clusterstate.json doesn't have CollectionX, and /collections 
doesn't show CollectionX either.

The result of this situation is an exception in the overseer queue loop like 
this:

{{org.apache.solr.cloud.Overseer$ClusterStateUpdater; Exception in Overseer 
main queue loop}}

This exception gets the queue stuck, stopping the cluster. I think it is easy 
to replicate with a test.

I think that before sending an OK on the DELETE command we must ensure that 
nothing about this collection still exists on the cluster.

  was:
I think that I found a bug in DELETE collectionAPI command.

Environment:
  - N boxes, the number is not important.
  - A collection with N shard spreed over the N boxes.
  - Solr.xml old style.
  
I ran the command as 
http://localhost:8983/sorl/admin/collections?action=DELETE&name=CollectionX

The command return a 200 all was cleaned and in theory the collection was 
removed ... but for some reason, one of the boxes doesn't delete the references 
of CollectionX from the solr.xml and the folders of cores still exists. The 
clusterstate.json doesn't have the CollectionX and the /collections doesn't 
show the collectionX either.

This result of this situation is an exception in overseer queue loop like this:
org.apache.solr.cloud.Overseer$ClusterStateUpdater; Exception in Overseer main 
queue loop

This exception stuck the queue and stoping the cluster. I think that is easy 
replicate it with a test.

I think that before to send an ok in DELETE command we must ensure that nothing 
about this collection still existing on the cluster.


> DELETE collection command doesn't works in some cases
> -
>
> Key: SOLR-5559
> URL: https://issues.apache.org/jira/browse/SOLR-5559
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 4.6
>Reporter: Yago Riveiro
>
> I think that I found a bug in the DELETE collection API command.
> Environment:
>   - N boxes, the number is not important.
>   - A collection with N shards spread over the N boxes.
>   - Solr.xml old style.
>   
> I ran the command as 
> http://localhost:8983/solr/admin/collections?action=DELETE&name=CollectionX
> The command returned a 200, everything was cleaned, and in theory the 
> collection was removed ... but for some reason one of the boxes didn't delete 
> the references to CollectionX from its solr.xml, and the folders of the cores 
> still exist. The clusterstate.json doesn't have CollectionX, and /collections 
> doesn't show CollectionX either.
> The result of this situation is an exception in the overseer queue loop like 
> this:
> {{org.apache.solr.cloud.Overseer$ClusterStateUpdater; Exception in Overseer 
> main queue loop}}
> This exception gets the queue stuck, stopping the cluster. I think it is easy 
> to replicate with a test.
> I think that before sending an OK on the DELETE command we must ensure that 
> nothing about this collection still exists on the cluster.






[jira] [Updated] (SOLR-5559) DELETE collection command doesn't works in some cases

2013-12-19 Thread Yago Riveiro (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yago Riveiro updated SOLR-5559:
---

Description: 
I think that I found a bug in DELETE collectionAPI command.

Environment:
  - N boxes, the number is not important.
  - A collection with N shard spreed over the N boxes.
  - Solr.xml old style.
  
I ran the command as 
http://localhost:8983/sorl/admin/collections?action=DELETE&name=CollectionX

The command return a 200 all was cleaned and in theory the collection was 
removed ... but for some reason, one of the boxes doesn't delete the references 
of CollectionX from the solr.xml and the folders of cores still exists. The 
clusterstate.json doesn't have the CollectionX and the /collections doesn't 
show the collectionX either.

This result of this situation is an exception in overseer queue loop like this:

{{org.apache.solr.cloud.Overseer$ClusterStateUpdater; Exception in Overseer 
main queue loop}}

This exception stuck the queue and stoping the cluster. I think that is easy 
replicate it with a test case.

I think that before to send an ok in DELETE command we must ensure that nothing 
about this collection still existing on the cluster.

  was:
I think that I found a bug in DELETE collectionAPI command.

Environment:
  - N boxes, the number is not important.
  - A collection with N shard spreed over the N boxes.
  - Solr.xml old style.
  
I ran the command as 
http://localhost:8983/sorl/admin/collections?action=DELETE&name=CollectionX

The command return a 200 all was cleaned and in theory the collection was 
removed ... but for some reason, one of the boxes doesn't delete the references 
of CollectionX from the solr.xml and the folders of cores still exists. The 
clusterstate.json doesn't have the CollectionX and the /collections doesn't 
show the collectionX either.

This result of this situation is an exception in overseer queue loop like this:

{{org.apache.solr.cloud.Overseer$ClusterStateUpdater; Exception in Overseer 
main queue loop}}

This exception stuck the queue and stoping the cluster. I think that is easy 
replicate it with a test.

I think that before to send an ok in DELETE command we must ensure that nothing 
about this collection still existing on the cluster.


> DELETE collection command doesn't works in some cases
> -
>
> Key: SOLR-5559
> URL: https://issues.apache.org/jira/browse/SOLR-5559
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 4.6
>Reporter: Yago Riveiro
>
> I think that I found a bug in DELETE collectionAPI command.
> Environment:
>   - N boxes, the number is not important.
>   - A collection with N shard spreed over the N boxes.
>   - Solr.xml old style.
>   
> I ran the command as 
> http://localhost:8983/sorl/admin/collections?action=DELETE&name=CollectionX
> The command return a 200 all was cleaned and in theory the collection was 
> removed ... but for some reason, one of the boxes doesn't delete the 
> references of CollectionX from the solr.xml and the folders of cores still 
> exists. The clusterstate.json doesn't have the CollectionX and the 
> /collections doesn't show the collectionX either.
> This result of this situation is an exception in overseer queue loop like 
> this:
> {{org.apache.solr.cloud.Overseer$ClusterStateUpdater; Exception in Overseer 
> main queue loop}}
> This exception stuck the queue and stoping the cluster. I think that is easy 
> replicate it with a test case.
> I think that before to send an ok in DELETE command we must ensure that 
> nothing about this collection still existing on the cluster.






[jira] [Updated] (SOLR-5559) DELETE collection command doesn't works in some cases

2013-12-19 Thread Yago Riveiro (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yago Riveiro updated SOLR-5559:
---

Description: 
I think that I found a bug in DELETE collectionAPI command.

Environment:
  - N boxes, the number is not important.
  - A collection with N shard spreed over the N boxes.
  - Solr.xml old style.
  
I ran the command as 
http://localhost:8983/sorl/admin/collections?action=DELETE&name=CollectionX

The command return a 200 all was cleaned and in theory the collection was 
removed ... but for some reason, one of the boxes doesn't delete the references 
of CollectionX from the solr.xml and the folders of cores still exists. The 
clusterstate.json doesn't have the CollectionX and the /collections doesn't 
show the collectionX either.

This result of this situation is an exception in overseer queue loop like this:
org.apache.solr.cloud.Overseer$ClusterStateUpdater; Exception in Overseer main 
queue loop

This exception stuck the queue and stoping the cluster. I think that is easy 
replicate it with a test.

I think that before to send an ok in DELETE command we must ensure that nothing 
about this collection still existing on the cluster.

> DELETE collection command doesn't works in some cases
> -
>
> Key: SOLR-5559
> URL: https://issues.apache.org/jira/browse/SOLR-5559
> Project: Solr
>  Issue Type: Bug
>    Affects Versions: 4.6
>Reporter: Yago Riveiro
>
> I think that I found a bug in DELETE collectionAPI command.
> Environment:
>   - N boxes, the number is not important.
>   - A collection with N shard spreed over the N boxes.
>   - Solr.xml old style.
>   
> I ran the command as 
> http://localhost:8983/sorl/admin/collections?action=DELETE&name=CollectionX
> The command return a 200 all was cleaned and in theory the collection was 
> removed ... but for some reason, one of the boxes doesn't delete the 
> references of CollectionX from the solr.xml and the folders of cores still 
> exists. The clusterstate.json doesn't have the CollectionX and the 
> /collections doesn't show the collectionX either.
> This result of this situation is an exception in overseer queue loop like 
> this:
> org.apache.solr.cloud.Overseer$ClusterStateUpdater; Exception in Overseer 
> main queue loop
> This exception stuck the queue and stoping the cluster. I think that is easy 
> replicate it with a test.
> I think that before to send an ok in DELETE command we must ensure that 
> nothing about this collection still existing on the cluster.






[jira] [Created] (SOLR-5559) DELETE collection command doesn't works in some cases

2013-12-19 Thread Yago Riveiro (JIRA)
Yago Riveiro created SOLR-5559:
--

 Summary: DELETE collection command doesn't works in some cases
 Key: SOLR-5559
 URL: https://issues.apache.org/jira/browse/SOLR-5559
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.6
Reporter: Yago Riveiro









[jira] [Commented] (SOLR-4260) Inconsistent numDocs between leader and replica

2013-12-06 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13841870#comment-13841870
 ] 

Yago Riveiro commented on SOLR-4260:


Replicas are still losing docs in Solr 4.6 :(.

I'm wondering if we couldn't have a pair (version, numDocs) to track the 
increments of docs between versions. We could also save the last 10 tlogs in 
each replica as backups after they are committed, and make a diff to see what 
is missing when the replicas are out of sync, then replay those transactions. 
This would avoid an unsynchronized replica and a full recovery, which will 
probably be heavier than making the diff.

It's only an idea, and of course finding the bug must be the priority.

This issue compromises Solr as "the main" storage. If re-indexing the data is 
not possible, we can't guarantee that no data is missing, and worse, we lose 
the data forever :(.


> Inconsistent numDocs between leader and replica
> ---
>
> Key: SOLR-4260
> URL: https://issues.apache.org/jira/browse/SOLR-4260
> Project: Solr
>  Issue Type: Bug
>  Components: SolrCloud
> Environment: 5.0.0.2013.01.04.15.31.51
>Reporter: Markus Jelsma
>Assignee: Mark Miller
>Priority: Critical
> Fix For: 5.0, 4.7
>
> Attachments: 192.168.20.102-replica1.png, 
> 192.168.20.104-replica2.png, clusterstate.png
>
>
> After wiping all cores and reindexing some 3.3 million docs from Nutch using 
> CloudSolrServer we see inconsistencies between the leader and replica for 
> some shards.
> Each core hold about 3.3k documents. For some reason 5 out of 10 shards have 
> a small deviation in then number of documents. The leader and slave deviate 
> for roughly 10-20 documents, not more.
> Results hopping ranks in the result set for identical queries got my 
> attention, there were small IDF differences for exactly the same record 
> causing a record to shift positions in the result set. During those tests no 
> records were indexed. Consecutive catch all queries also return different 
> number of numDocs.
> We're running a 10 node test cluster with 10 shards and a replication factor 
> of two and frequently reindex using a fresh build from trunk. I've not seen 
> this issue for quite some time until a few days ago.






[jira] [Commented] (SOLR-5428) new statistics results to StatsComponent - distinctValues and countDistinct

2013-11-20 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13827816#comment-13827816
 ] 

Yago Riveiro commented on SOLR-5428:


For me, the utility of this patch is the possibility of getting distinctValues 
and countDistinct in a distributed environment. If it's possible to implement 
this patch on top of the AnalyticsComponent, I think that should be done, for 
the simple fact that the StatsComponent will eventually be deprecated.

The question is that 
[SOLR-5302|https://issues.apache.org/jira/browse/SOLR-5302] will not be 
released soon, maybe in Solr 5.0, whereas this patch is straightforward enough 
that it could be released in Solr 4.7 with some tweaks. 

> new statistics results to StatsComponent - distinctValues and countDistinct
> ---
>
> Key: SOLR-5428
> URL: https://issues.apache.org/jira/browse/SOLR-5428
> Project: Solr
>  Issue Type: New Feature
>Reporter: Elran Dvir
>Assignee: Shalin Shekhar Mangar
> Attachments: SOLR-5428.patch, SOLR-5428.patch
>
>
> I thought it would be very useful to display the distinct values (and the 
> count) of a field among other statistics. Attached a patch implementing this 
> in StatsComponent.
> Added results  :
> "distinctValues" - list of all distinct values
> "countDistinct" - distinct values count.






[jira] [Commented] (SOLR-5428) new statistics results to StatsComponent - distinctValues and countDistinct

2013-11-20 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13827788#comment-13827788
 ] 

Yago Riveiro commented on SOLR-5428:


I think the analytics component doesn't support distributed queries.

> new statistics results to StatsComponent - distinctValues and countDistinct
> ---
>
> Key: SOLR-5428
> URL: https://issues.apache.org/jira/browse/SOLR-5428
> Project: Solr
>  Issue Type: New Feature
>Reporter: Elran Dvir
>Assignee: Shalin Shekhar Mangar
> Attachments: SOLR-5428.patch, SOLR-5428.patch
>
>
> I thought it would be very useful to display the distinct values (and the 
> count) of a field among other statistics. Attached a patch implementing this 
> in StatsComponent.
> Added results  :
> "distinctValues" - list of all distinct values
> "countDistinct" - distinct values count.






[jira] [Commented] (SOLR-5477) Async execution of OverseerCollectionProcessor tasks

2013-11-20 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13827764#comment-13827764
 ] 

Yago Riveiro commented on SOLR-5477:


Related to this feature, we could add a notification panel in the UI.
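The async flow proposed in SOLR-5477 could be exercised like this. The parameter names follow the issue text (`command=status&id=...`), so the shipped API may differ; the URLs are only constructed, never sent:

```python
# Sketch of the async flow proposed in this issue, using the parameter names
# from the issue text (command=status&id=...); the final API may differ.
# The URLs are only built here, nothing is sent to a server.
from urllib.parse import urlencode

base = "http://localhost:8983/solr/admin/collections"

# 1. Fire a long-running command with async=true; instead of blocking until it
#    finishes, the call returns a task id that is written to ZooKeeper.
submit_url = base + "?" + urlencode(
    {"action": "SPLITSHARD", "collection": "c1", "shard": "shard1", "async": "true"}
)

# 2. Poll the task status later with the returned id; per the issue text,
#    omitting id would list all running async tasks.
status_url = base + "?" + urlencode({"command": "status", "id": "7657668909"})
```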

> Async execution of OverseerCollectionProcessor tasks
> 
>
> Key: SOLR-5477
> URL: https://issues.apache.org/jira/browse/SOLR-5477
> Project: Solr
>  Issue Type: Sub-task
>  Components: SolrCloud
>Reporter: Noble Paul
>
> Typical collection admin commands are long-running, and it is very common for 
> the requests to time out. It is more of a problem if the cluster is very 
> large. Add an option to run these commands asynchronously:
> add an extra param async=true for all collection commands;
> the task is written to ZK and the caller is returned a task id. 
> A separate collection admin command will be added to poll the status of the 
> task:
> command=status&id=7657668909
> If id is not passed, all running async tasks should be listed.
> A separate queue is created to store in-process tasks. After the tasks are 
> completed, the queue entry is removed. OverseerCollectionProcessor will 
> perform these tasks in multiple threads.






[jira] [Commented] (SOLR-4260) Inconsistent numDocs between leader and replica

2013-11-19 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13826533#comment-13826533
 ] 

Yago Riveiro commented on SOLR-4260:


I'm using   to enable per-field 
DocValues formats.

I think this aspect of docValues isn't explained properly on the wiki. There is 
no example of how to switch to the default format, do the forceMerge, and 
switch back to the original implementation.

If I can't be confident that everything will work fine, I can't do the upgrade.
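For the middle step of that procedure (forcing every segment to be rewritten), a rough sketch via the update handler's optimize command. The host and collection name are hypothetical, the URL is only built rather than sent, and the schema edits before and after this step are not shown:

```python
# Rough sketch of the "forceMerge" step in the procedure discussed above,
# expressed as the update handler's optimize command. Host and collection are
# hypothetical; the URL is only constructed, not sent, and the codec/schema
# changes that surround this step are not shown.
from urllib.parse import urlencode

params = urlencode({
    "optimize": "true",      # rewrite the index down to maxSegments segments
    "maxSegments": 1,
    "waitSearcher": "true",  # block until the new searcher is registered
})
optimize_url = "http://localhost:8983/solr/collection1/update?" + params
```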

> Inconsistent numDocs between leader and replica
> ---
>
> Key: SOLR-4260
> URL: https://issues.apache.org/jira/browse/SOLR-4260
> Project: Solr
>  Issue Type: Bug
>  Components: SolrCloud
>Affects Versions: 5.0
> Environment: 5.0.0.2013.01.04.15.31.51
>Reporter: Markus Jelsma
>Assignee: Mark Miller
>Priority: Critical
> Fix For: 5.0
>
> Attachments: 192.168.20.102-replica1.png, 
> 192.168.20.104-replica2.png, clusterstate.png
>
>
> After wiping all cores and reindexing some 3.3 million docs from Nutch using 
> CloudSolrServer we see inconsistencies between the leader and replica for 
> some shards.
> Each core holds about 3.3k documents. For some reason 5 out of 10 shards have 
> a small deviation in the number of documents. The leader and slave deviate 
> for roughly 10-20 documents, not more.
> Results hopping ranks in the result set for identical queries got my 
> attention, there were small IDF differences for exactly the same record 
> causing a record to shift positions in the result set. During those tests no 
> records were indexed. Consecutive catch all queries also return different 
> number of numDocs.
> We're running a 10 node test cluster with 10 shards and a replication factor 
> of two and frequently reindex using a fresh build from trunk. I've not seen 
> this issue for quite some time until a few days ago.






[jira] [Commented] (SOLR-4260) Inconsistent numDocs between leader and replica

2013-11-19 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13826356#comment-13826356
 ] 

Yago Riveiro commented on SOLR-4260:


Is it safe to upgrade from 4.5.1 to 4.6? I have docValues, and I read that the 
upgrade isn't straightforward and I can't reindex the data.

> Inconsistent numDocs between leader and replica
> ---
>
> Key: SOLR-4260
> URL: https://issues.apache.org/jira/browse/SOLR-4260
> Project: Solr
>  Issue Type: Bug
>  Components: SolrCloud
>Affects Versions: 5.0
> Environment: 5.0.0.2013.01.04.15.31.51
>Reporter: Markus Jelsma
>Assignee: Mark Miller
>Priority: Critical
> Fix For: 5.0
>
> Attachments: 192.168.20.102-replica1.png, 
> 192.168.20.104-replica2.png, clusterstate.png
>
>
> After wiping all cores and reindexing some 3.3 million docs from Nutch using 
> CloudSolrServer we see inconsistencies between the leader and replica for 
> some shards.
> Each core holds about 3.3k documents. For some reason 5 out of 10 shards have 
> a small deviation in the number of documents. The leader and slave deviate 
> for roughly 10-20 documents, not more.
> Results hopping ranks in the result set for identical queries got my 
> attention, there were small IDF differences for exactly the same record 
> causing a record to shift positions in the result set. During those tests no 
> records were indexed. Consecutive catch all queries also return different 
> number of numDocs.
> We're running a 10 node test cluster with 10 shards and a replication factor 
> of two and frequently reindex using a fresh build from trunk. I've not seen 
> this issue for quite some time until a few days ago.






[jira] [Commented] (SOLR-4260) Inconsistent numDocs between leader and replica

2013-11-15 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13823887#comment-13823887
 ] 

Yago Riveiro commented on SOLR-4260:


Mark, 

I can confirm that I had session expirations in my logs at some point in time. 
My indexing rate is high, and sometimes my boxes are under some "pressure".

My problem is that I don't know how to deal with the situation. I'm using a 
non-Java client, and I don't know how to debug this or which tools I can use to 
gather information that would help debug this issue.

> Inconsistent numDocs between leader and replica
> ---
>
> Key: SOLR-4260
> URL: https://issues.apache.org/jira/browse/SOLR-4260
> Project: Solr
>  Issue Type: Bug
>  Components: SolrCloud
>Affects Versions: 5.0
> Environment: 5.0.0.2013.01.04.15.31.51
>Reporter: Markus Jelsma
>Priority: Critical
> Fix For: 5.0
>
> Attachments: 192.168.20.102-replica1.png, 
> 192.168.20.104-replica2.png, clusterstate.png
>
>
> After wiping all cores and reindexing some 3.3 million docs from Nutch using 
> CloudSolrServer we see inconsistencies between the leader and replica for 
> some shards.
> Each core holds about 3.3k documents. For some reason 5 out of 10 shards have 
> a small deviation in the number of documents. The leader and slave deviate 
> for roughly 10-20 documents, not more.
> Results hopping ranks in the result set for identical queries got my 
> attention, there were small IDF differences for exactly the same record 
> causing a record to shift positions in the result set. During those tests no 
> records were indexed. Consecutive catch all queries also return different 
> number of numDocs.
> We're running a 10 node test cluster with 10 shards and a replication factor 
> of two and frequently reindex using a fresh build from trunk. I've not seen 
> this issue for quite some time until a few days ago.






[jira] [Commented] (SOLR-4260) Inconsistent numDocs between leader and replica

2013-11-15 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13823867#comment-13823867
 ] 

Yago Riveiro commented on SOLR-4260:


{quote}I thought the updates are synchronously distributed{quote}

My knowledge about how replication is done is very limited. To me, replication 
is a set of distributed HTTP requests to all replicas: if all responses return 
code 200, the insertion was successful. I don't know whether, internally, the 
200 is returned when the document is written to the tlog or to the open segment.

"Up-to-date" in this case means nothing: your data is compromised, and you 
can't guarantee which is the correct replica. The logic could be to pick the 
replica with more docs and build a new replica from it, but you still can't 
know, without checking document by document, whether you have all the data. An 
extreme case would be a full reindex of the data (if you can).
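The "replica with more docs" heuristic mentioned above, as a toy sketch. The numbers are made up, and this is only the fallback suggested in the comment, not anything Solr actually implements:

```python
# Toy sketch of the fallback heuristic above: absent any ground truth, pick the
# replica reporting the most documents as the recovery source. The numbers are
# made up, and this is only the heuristic suggested in the comment, not
# anything Solr actually does.

replica_num_docs = {
    "replica1": 6072661,
    "replica2": 6072575,
}

# Candidate recovery source: the replica with the highest numDocs.
source = max(replica_num_docs, key=replica_num_docs.get)

# Caveat from the comment: a higher count still doesn't prove completeness,
# since replicas can disagree on *which* documents they hold; only a doc-by-doc
# comparison (or a full reindex) gives certainty.
```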


> Inconsistent numDocs between leader and replica
> ---
>
> Key: SOLR-4260
> URL: https://issues.apache.org/jira/browse/SOLR-4260
> Project: Solr
>  Issue Type: Bug
>  Components: SolrCloud
>Affects Versions: 5.0
> Environment: 5.0.0.2013.01.04.15.31.51
>Reporter: Markus Jelsma
>Priority: Critical
> Fix For: 5.0
>
> Attachments: 192.168.20.102-replica1.png, 
> 192.168.20.104-replica2.png, clusterstate.png
>
>
> After wiping all cores and reindexing some 3.3 million docs from Nutch using 
> CloudSolrServer we see inconsistencies between the leader and replica for 
> some shards.
> Each core holds about 3.3k documents. For some reason 5 out of 10 shards have 
> a small deviation in the number of documents. The leader and slave deviate 
> for roughly 10-20 documents, not more.
> Results hopping ranks in the result set for identical queries got my 
> attention, there were small IDF differences for exactly the same record 
> causing a record to shift positions in the result set. During those tests no 
> records were indexed. Consecutive catch all queries also return different 
> number of numDocs.
> We're running a 10 node test cluster with 10 shards and a replication factor 
> of two and frequently reindex using a fresh build from trunk. I've not seen 
> this issue for quite some time until a few days ago.






[jira] [Commented] (SOLR-4260) Inconsistent numDocs between leader and replica

2013-11-15 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13823517#comment-13823517
 ] 

Yago Riveiro commented on SOLR-4260:


Jessica, 

At some point in the process, the leader can be demoted to a replica, and the 
other replica, with fewer documents, becomes the leader. In this case, the old 
leader (after recovery) can be updated as usual, and you end up with the leader 
behind the replica if the recovery doesn't fix the deviation.

> Inconsistent numDocs between leader and replica
> ---
>
> Key: SOLR-4260
> URL: https://issues.apache.org/jira/browse/SOLR-4260
> Project: Solr
>  Issue Type: Bug
>  Components: SolrCloud
>Affects Versions: 5.0
> Environment: 5.0.0.2013.01.04.15.31.51
>Reporter: Markus Jelsma
>Priority: Critical
> Fix For: 5.0
>
> Attachments: 192.168.20.102-replica1.png, 
> 192.168.20.104-replica2.png, clusterstate.png
>
>
> After wiping all cores and reindexing some 3.3 million docs from Nutch using 
> CloudSolrServer we see inconsistencies between the leader and replica for 
> some shards.
> Each core holds about 3.3k documents. For some reason 5 out of 10 shards have 
> a small deviation in the number of documents. The leader and slave deviate 
> for roughly 10-20 documents, not more.
> Results hopping ranks in the result set for identical queries got my 
> attention, there were small IDF differences for exactly the same record 
> causing a record to shift positions in the result set. During those tests no 
> records were indexed. Consecutive catch all queries also return different 
> number of numDocs.
> We're running a 10 node test cluster with 10 shards and a replication factor 
> of two and frequently reindex using a fresh build from trunk. I've not seen 
> this issue for quite some time until a few days ago.






[jira] [Commented] (SOLR-5428) new statistics results to StatsComponent - distinctValues and countDistinct

2013-11-14 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13822612#comment-13822612
 ] 

Yago Riveiro commented on SOLR-5428:


Ok, I forgot that the StatsComponent returns all metrics in one call.

Maybe the StatsComponent needs some tweaking to return only the metrics we need 
and not all of them. If the analytics component could work with distributed 
searches, this patch would not be necessary.

> new statistics results to StatsComponent - distinctValues and countDistinct
> ---
>
> Key: SOLR-5428
> URL: https://issues.apache.org/jira/browse/SOLR-5428
> Project: Solr
>  Issue Type: New Feature
>Reporter: Elran Dvir
>Assignee: Shalin Shekhar Mangar
> Attachments: SOLR-5428.patch
>
>
> I thought it would be very useful to display the distinct values (and the 
> count) of a field among other statistics. Attached a patch implementing this 
> in StatsComponent.
> Added results  :
> "distinctValues" - list of all distinct values
> "countDistinct" - distinct values count.






[jira] [Commented] (SOLR-5428) new statistics results to StatsComponent - distinctValues and countDistinct

2013-11-14 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13822470#comment-13822470
 ] 

Yago Riveiro commented on SOLR-5428:


Collecting the distinctValues can be expensive, but in my case it's a 
requirement that Solr can't satisfy in an easy way: I need to do a facet query 
with limit -1 to get all unique terms that match the query.

If the StatsComponent can do the same thing, expensive or not, I vote to have 
the feature. How to use it, and the pros and cons of using it, must be a 
decision made by the user.
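The facet workaround described above can be sketched as a query like the following. Collection and field names are hypothetical, and the URL is only built, not sent:

```python
# Minimal sketch of the workaround described above: emulating distinct-values
# collection with a facet query and facet.limit=-1 (return all terms).
# Collection and field names are hypothetical; nothing is sent to a server.
from urllib.parse import urlencode

params = urlencode({
    "q": "*:*",
    "rows": 0,             # no documents needed, only the facet counts
    "facet": "true",
    "facet.field": "user_id",
    "facet.limit": -1,     # -1 = unlimited: every distinct term is returned
    "facet.mincount": 1,
})
query_url = "http://localhost:8983/solr/collection1/select?" + params
```

The number of distinct values is then the length of the returned term list, which is exactly why this gets expensive on high-cardinality fields.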

> new statistics results to StatsComponent - distinctValues and countDistinct
> ---
>
> Key: SOLR-5428
> URL: https://issues.apache.org/jira/browse/SOLR-5428
> Project: Solr
>  Issue Type: New Feature
>Reporter: Elran Dvir
>Assignee: Shalin Shekhar Mangar
> Attachments: SOLR-5428.patch
>
>
> I thought it would be very useful to display the distinct values (and the 
> count) of a field among other statistics. Attached a patch implementing this 
> in StatsComponent.
> Added results  :
> "distinctValues" - list of all distinct values
> "countDistinct" - distinct values count.






[jira] [Commented] (SOLR-4260) Inconsistent numDocs between leader and replica

2013-11-13 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13821375#comment-13821375
 ] 

Yago Riveiro commented on SOLR-4260:


{quote} Currently, if shards eventually get out of whack, the best you can do 
is trigger a new recovery against the leader.{quote}

What happens when the leader is the replica with fewer docs? Is the replication 
done the right way?

> Inconsistent numDocs between leader and replica
> ---
>
> Key: SOLR-4260
> URL: https://issues.apache.org/jira/browse/SOLR-4260
> Project: Solr
>  Issue Type: Bug
>  Components: SolrCloud
>Affects Versions: 5.0
> Environment: 5.0.0.2013.01.04.15.31.51
>Reporter: Markus Jelsma
>Priority: Critical
> Fix For: 5.0
>
> Attachments: 192.168.20.102-replica1.png, 
> 192.168.20.104-replica2.png, clusterstate.png
>
>
> After wiping all cores and reindexing some 3.3 million docs from Nutch using 
> CloudSolrServer we see inconsistencies between the leader and replica for 
> some shards.
> Each core holds about 3.3k documents. For some reason 5 out of 10 shards have 
> a small deviation in the number of documents. The leader and slave deviate 
> for roughly 10-20 documents, not more.
> Results hopping ranks in the result set for identical queries got my 
> attention, there were small IDF differences for exactly the same record 
> causing a record to shift positions in the result set. During those tests no 
> records were indexed. Consecutive catch all queries also return different 
> number of numDocs.
> We're running a 10 node test cluster with 10 shards and a replication factor 
> of two and frequently reindex using a fresh build from trunk. I've not seen 
> this issue for quite some time until a few days ago.






[jira] [Commented] (SOLR-5428) new statistics results to StatsComponent - distinctValues and countDistinct

2013-11-13 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13821371#comment-13821371
 ] 

Yago Riveiro commented on SOLR-5428:


This tiny patch is very useful.

One question: in the case of the Stats component, is all the work done on the 
heap, or does it leverage the benefits of docValues?

> new statistics results to StatsComponent - distinctValues and countDistinct
> ---
>
> Key: SOLR-5428
> URL: https://issues.apache.org/jira/browse/SOLR-5428
> Project: Solr
>  Issue Type: New Feature
>Reporter: Elran Dvir
>Assignee: Shalin Shekhar Mangar
> Attachments: SOLR-5428.patch
>
>
> I thought it would be very useful to display the distinct values (and the 
> count) of a field among other statistics. Attached a patch implementing this 
> in StatsComponent.
> Added results  :
> "distinctValues" - list of all distinct values
> "countDistinct" - distinct values count.






[jira] [Commented] (SOLR-5428) new statistics results to StatsComponent - distinctValues and countDistinct

2013-11-12 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13820159#comment-13820159
 ] 

Yago Riveiro commented on SOLR-5428:


Does this patch work in distributed queries?

> new statistics results to StatsComponent - distinctValues and countDistinct
> ---
>
> Key: SOLR-5428
> URL: https://issues.apache.org/jira/browse/SOLR-5428
> Project: Solr
>  Issue Type: New Feature
>Reporter: Elran Dvir
> Attachments: SOLR-5428.patch
>
>
> I thought it would be very useful to display the distinct values (and the 
> count) of a field among other statistics. Attached a patch implementing this 
> in StatsComponent.
> Added results  :
> "distinctValues" - list of all distinct values
> "countDistinct" - distinct values count.






[jira] [Comment Edited] (SOLR-4260) Inconsistent numDocs between leader and replica

2013-11-05 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13813850#comment-13813850
 ] 

Yago Riveiro edited comment on SOLR-4260 at 11/5/13 11:15 AM:
--

I attached some screenshots

The shard is the shard11:

1 - clusterstate: this screenshot shows replica2 192.168.20.104 as the leader
2 - replica2 has a lower gen than replica1 and is the leader; is this correct?


was (Author: yriveiro):
I attached some screenshots

The shard in question is the shard11:

1 - clusterstate: this screenshot shows replica2 192.168.20.104 as the leader
2 - the replica 2 has lower gen that replica1 and is the leader, is this 
correct?

> Inconsistent numDocs between leader and replica
> ---
>
> Key: SOLR-4260
> URL: https://issues.apache.org/jira/browse/SOLR-4260
> Project: Solr
>  Issue Type: Bug
>  Components: SolrCloud
>Affects Versions: 5.0
> Environment: 5.0.0.2013.01.04.15.31.51
>Reporter: Markus Jelsma
>Priority: Critical
> Fix For: 5.0
>
> Attachments: 192.168.20.102-replica1.png, 
> 192.168.20.104-replica2.png, clusterstate.png
>
>
> After wiping all cores and reindexing some 3.3 million docs from Nutch using 
> CloudSolrServer we see inconsistencies between the leader and replica for 
> some shards.
> Each core holds about 3.3k documents. For some reason 5 out of 10 shards have 
> a small deviation in the number of documents. The leader and slave deviate 
> for roughly 10-20 documents, not more.
> Results hopping ranks in the result set for identical queries got my 
> attention, there were small IDF differences for exactly the same record 
> causing a record to shift positions in the result set. During those tests no 
> records were indexed. Consecutive catch all queries also return different 
> number of numDocs.
> We're running a 10 node test cluster with 10 shards and a replication factor 
> of two and frequently reindex using a fresh build from trunk. I've not seen 
> this issue for quite some time until a few days ago.






[jira] [Comment Edited] (SOLR-4260) Inconsistent numDocs between leader and replica

2013-11-05 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13813850#comment-13813850
 ] 

Yago Riveiro edited comment on SOLR-4260 at 11/5/13 11:14 AM:
--

I attached some screenshots

The shard in question is the shard11:

1 - clusterstate: this screenshot shows replica2 192.168.20.104 as the leader
2 - replica2 has a lower gen than replica1 and is the leader; is this correct?


was (Author: yriveiro):
I attached some screenshots

1 - clusterstate: this screenshot shows replica2 192.168.20.104 as the leader
2 - the replica 2 has lower gen that replica1 and is the leader, is this 
correct?

> Inconsistent numDocs between leader and replica
> ---
>
> Key: SOLR-4260
> URL: https://issues.apache.org/jira/browse/SOLR-4260
> Project: Solr
>  Issue Type: Bug
>  Components: SolrCloud
>Affects Versions: 5.0
> Environment: 5.0.0.2013.01.04.15.31.51
>Reporter: Markus Jelsma
>Priority: Critical
> Fix For: 5.0
>
> Attachments: 192.168.20.102-replica1.png, 
> 192.168.20.104-replica2.png, clusterstate.png
>
>
> After wiping all cores and reindexing some 3.3 million docs from Nutch using 
> CloudSolrServer we see inconsistencies between the leader and replica for 
> some shards.
> Each core holds about 3.3k documents. For some reason 5 out of 10 shards have 
> a small deviation in the number of documents. The leader and slave deviate 
> for roughly 10-20 documents, not more.
> Results hopping ranks in the result set for identical queries got my 
> attention, there were small IDF differences for exactly the same record 
> causing a record to shift positions in the result set. During those tests no 
> records were indexed. Consecutive catch all queries also return different 
> number of numDocs.
> We're running a 10 node test cluster with 10 shards and a replication factor 
> of two and frequently reindex using a fresh build from trunk. I've not seen 
> this issue for quite some time until a few days ago.






[jira] [Commented] (SOLR-4260) Inconsistent numDocs between leader and replica

2013-11-05 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13813850#comment-13813850
 ] 

Yago Riveiro commented on SOLR-4260:


I attached some screenshots

1 - clusterstate: this screenshot shows replica2 192.168.20.104 as the leader
2 - replica2 has a lower gen than replica1 and is the leader; is this correct?

> Inconsistent numDocs between leader and replica
> ---
>
> Key: SOLR-4260
> URL: https://issues.apache.org/jira/browse/SOLR-4260
> Project: Solr
>  Issue Type: Bug
>  Components: SolrCloud
>Affects Versions: 5.0
> Environment: 5.0.0.2013.01.04.15.31.51
>Reporter: Markus Jelsma
>Priority: Critical
> Fix For: 5.0
>
> Attachments: 192.168.20.102-replica1.png, 
> 192.168.20.104-replica2.png, clusterstate.png
>
>
> After wiping all cores and reindexing some 3.3 million docs from Nutch using 
> CloudSolrServer we see inconsistencies between the leader and replica for 
> some shards.
> Each core holds about 3.3k documents. For some reason 5 out of 10 shards have 
> a small deviation in the number of documents. The leader and slave deviate 
> for roughly 10-20 documents, not more.
> Results hopping ranks in the result set for identical queries got my 
> attention, there were small IDF differences for exactly the same record 
> causing a record to shift positions in the result set. During those tests no 
> records were indexed. Consecutive catch all queries also return different 
> number of numDocs.
> We're running a 10 node test cluster with 10 shards and a replication factor 
> of two and frequently reindex using a fresh build from trunk. I've not seen 
> this issue for quite some time until a few days ago.






[jira] [Updated] (SOLR-4260) Inconsistent numDocs between leader and replica

2013-11-05 Thread Yago Riveiro (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yago Riveiro updated SOLR-4260:
---

Attachment: 192.168.20.102-replica1.png
192.168.20.104-replica2.png
clusterstate.png

> Inconsistent numDocs between leader and replica
> ---
>
> Key: SOLR-4260
> URL: https://issues.apache.org/jira/browse/SOLR-4260
> Project: Solr
>  Issue Type: Bug
>  Components: SolrCloud
>Affects Versions: 5.0
> Environment: 5.0.0.2013.01.04.15.31.51
>Reporter: Markus Jelsma
>Priority: Critical
> Fix For: 5.0
>
> Attachments: 192.168.20.102-replica1.png, 
> 192.168.20.104-replica2.png, clusterstate.png
>
>
> After wiping all cores and reindexing some 3.3 million docs from Nutch using 
> CloudSolrServer we see inconsistencies between the leader and replica for 
> some shards.
> Each core holds about 3.3k documents. For some reason 5 out of 10 shards have 
> a small deviation in the number of documents. The leader and slave deviate 
> for roughly 10-20 documents, not more.
> Results hopping ranks in the result set for identical queries got my 
> attention, there were small IDF differences for exactly the same record 
> causing a record to shift positions in the result set. During those tests no 
> records were indexed. Consecutive catch all queries also return different 
> number of numDocs.
> We're running a 10 node test cluster with 10 shards and a replication factor 
> of two and frequently reindex using a fresh build from trunk. I've not seen 
> this issue for quite some time until a few days ago.






[jira] [Commented] (SOLR-4260) Inconsistent numDocs between leader and replica

2013-11-05 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13813827#comment-13813827
 ] 

Yago Riveiro commented on SOLR-4260:


Hi, I hit this bug with solr 4.5.1

replica 1:

lastModified:20 minutes ago
version:80616
numDocs:6072661
maxDoc:6072841
deletedDocs:180

replica 2 (leader)

lastModified:20 minutes ago
version:77595
numDocs:6072575
maxDoc:6072771
deletedDocs:196

I don't know when this happened, so I have no time frame for finding valuable 
information in the logs.

> Inconsistent numDocs between leader and replica
> ---
>
> Key: SOLR-4260
> URL: https://issues.apache.org/jira/browse/SOLR-4260
> Project: Solr
>  Issue Type: Bug
>  Components: SolrCloud
>Affects Versions: 5.0
> Environment: 5.0.0.2013.01.04.15.31.51
>Reporter: Markus Jelsma
>Priority: Critical
> Fix For: 5.0
>
>
> After wiping all cores and reindexing some 3.3 million docs from Nutch using 
> CloudSolrServer we see inconsistencies between the leader and replica for 
> some shards.
> Each core holds about 3.3k documents. For some reason 5 out of 10 shards have 
> a small deviation in the number of documents. The leader and slave deviate 
> for roughly 10-20 documents, not more.
> Results hopping ranks in the result set for identical queries got my 
> attention, there were small IDF differences for exactly the same record 
> causing a record to shift positions in the result set. During those tests no 
> records were indexed. Consecutive catch all queries also return different 
> number of numDocs.
> We're running a 10 node test cluster with 10 shards and a replication factor 
> of two and frequently reindex using a fresh build from trunk. I've not seen 
> this issue for quite some time until a few days ago.





