[GitHub] lucene-solr issue #531: SOLR-12768

2019-01-08 Thread dsmiley
Github user dsmiley commented on the issue:

https://github.com/apache/lucene-solr/pull/531
  
Code changes seem good; I'll look closer and run tests.

PathHierarchyTokenizerFactory _might_ eventually be used at query time if 
we had a syntax to say "get me all my ancestors".  But given that it's conditional 
on syntax, it can't easily go into the query analyzer, so I think it's better to 
leave the query-time chain simple/direct and do this work when we look at the 
syntax.  I know this has evolved recently from where we were.  In this issue I 
removed PathHierarchyTokenizer from the index side because I feel it's better to 
err on the side of lighter-weight indexing at the expense of slower queries; the 
cost is very likely low for typical use-cases.

RE an issue to discuss the syntax: yes, a subtask of SOLR-12298.  No hurry on 
that one.


---




[GitHub] lucene-solr issue #531: SOLR-12768

2019-01-03 Thread dsmiley
Github user dsmiley commented on the issue:

https://github.com/apache/lucene-solr/pull/531
  
Nonetheless we could stop trying to solve this ambiguous case further _for 
now_ (thus commit what you have here) since (a) this syntax is very 
experimental, (b) it's documented nowhere, and (c) it wasn't developed very openly.  
RE openness: given (a) & (b), the non-openness of (c) is okay, but if we 
_really_ want to make this feature known, it deserves its own issue to discuss, 
publicly, what the syntax ought to be.  A syntax to match paths is big enough 
that it shouldn't be buried within the scope of some other issue.


---




[GitHub] lucene-solr issue #531: SOLR-12768

2019-01-03 Thread dsmiley
Github user dsmiley commented on the issue:

https://github.com/apache/lucene-solr/pull/531
  
> Perhaps we could test whether "foo" is a defined field in the current 
collection?

Eh; that sounds too much like a "guess what the user might mean" kind of 
solution.  Even if "foo" is in the schema, it doesn't prevent an element "foo" 
from being used.

> I personally prefer keeping the API as simple as possible (I'm a believer 
of the KISS principle).

At the expense of ambiguity?  Trade-offs.


---




[GitHub] lucene-solr pull request #:

2019-01-03 Thread dsmiley
Github user dsmiley commented on the pull request:


https://github.com/apache/lucene-solr/commit/c04a3ac306b017054173d986e32ea217e2fc6595#commitcomment-31829970
  
In 
solr/core/src/java/org/apache/solr/response/transform/ChildDocTransformerFactory.java
 on line 162:
I think this illustrates a problem with this syntax that you invented.  The 
code here says the absence of any '/' means it's a "regular filter, not 
hierarchy based".  But how then do you articulate a path filter for all 
elements named "foo" regardless of parentage?  So I think "childFilter" is 
trying to do double duty here, leading to some edge cases as we try to guess 
what was intended.  I propose we instead have a separate local param, 
"pathFilter".  Both filters would be AND'ed together.  WDYT?


---




[GitHub] lucene-solr issue #531: SOLR-12768

2019-01-02 Thread dsmiley
Github user dsmiley commented on the issue:

https://github.com/apache/lucene-solr/pull/531
  
The way this field is indexed has been experimental and will remain so for 
a bit.  8.0 will be its more public debut, after which changing it will take 
more effort than it does today.  Nonetheless I will make this change clear in 
CHANGES.txt in case someone was using it.


---




[GitHub] lucene-solr issue #531: SOLR-12768

2019-01-02 Thread dsmiley
Github user dsmiley commented on the issue:

https://github.com/apache/lucene-solr/pull/531
  
Looks good.  I thought you were going to add a test showing we don't screw 
it up?  You could search for 'redients', which would have returned something 
before but now should not.

I wonder... hmmm... maybe the nest path should always start with a '/'?  It 
would seem more correct since paths in general usually start with one, and it 
would also make implementing this case simpler.
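
To illustrate the simplification (a sketch only; it assumes the path is indexed 
in a `_nest_path_` field and ignores the actual query-building code): with a 
leading '/', a single wildcard clause would suffice and couldn't be fooled by a suffix.

```java
import org.apache.lucene.index.Term;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.WildcardQuery;

public class LeadingSlashSketch {
  public static void main(String[] args) {
    // matches "/ingredients" and "/recipe/ingredients",
    // but cannot match "/secretingredients", which doesn't end in "/ingredients"
    Query pathMatch = new WildcardQuery(new Term("_nest_path_", "*/ingredients"));
    System.out.println(pathMatch);
  }
}
```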


---




[GitHub] lucene-solr issue #531: SOLR-12768

2019-01-01 Thread dsmiley
Github user dsmiley commented on the issue:

https://github.com/apache/lucene-solr/pull/531
  
Thanks for helping out!
It's insufficient to simply prepend a `*` since it would match other labels 
that share the same suffix.  For example: 
`childFilter='ingredients/name_s:cocoa'` would get transformed into a query 
that might match a label `secretingredients`, but we don't want it to.  I think 
a solution is to build an OR query where one clause has `*/` prepended to the 
term, and the other leaves it unmodified so it matches exactly.  WDYT?
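
A minimal Lucene sketch of that OR idea (assuming the path portion is matched 
against a `_nest_path_` field; this is not the PR's actual query-building code):

```java
import org.apache.lucene.index.Term;
import org.apache.lucene.search.BooleanClause;
import org.apache.lucene.search.BooleanQuery;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.search.WildcardQuery;

public class AncestryOrQuerySketch {
  public static void main(String[] args) {
    String path = "ingredients";  // the path portion of childFilter='ingredients/name_s:cocoa'
    BooleanQuery.Builder b = new BooleanQuery.Builder();
    // clause 1: the unmodified term, matching "ingredients" exactly (a top-level child)
    b.add(new TermQuery(new Term("_nest_path_", path)), BooleanClause.Occur.SHOULD);
    // clause 2: "*/" prepended, matching "ingredients" under any ancestry
    // (e.g. "recipe/ingredients") but not a mere suffix like "secretingredients"
    b.add(new WildcardQuery(new Term("_nest_path_", "*/" + path)), BooleanClause.Occur.SHOULD);
    Query pathQuery = b.build();
    System.out.println(pathQuery);
  }
}
```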


---




[GitHub] lucene-solr issue #528: SOLR-12955 2

2018-12-31 Thread dsmiley
Github user dsmiley commented on the issue:

https://github.com/apache/lucene-solr/pull/528
  
Man, this is a tangled mess we are trying to unravel!  I wish someone had 
undertaken this years ago.

Let's not rename DUP#processDelete, as it is a URP-defined method we 
override, and I doubt it'd be worth it to expand the change to URP.

postProcessDeleteByQuery can be renamed to doDistribDeleteByQuery, and 
postProcessDeleteById can be renamed to doDistribDeleteById.

The flows of doDeleteById & doDeleteByQuery are different, which is a bit 
annoying.  It's hard for me to make much sense of it right now.  I know the ZK 
version entirely overrides processCommit, but I think the flow there was 
clearer, at least to me.  On the delete side, though, I find it confusing that 
the ZK version overrides doDeleteByQuery. ... gotta go for now.

Let me know when you're done with the issue and I can try some things too.  
It's such a time sink, unfortunately.



---




[GitHub] lucene-solr issue #528: SOLR-12955 2

2018-12-26 Thread dsmiley
Github user dsmiley commented on the issue:

https://github.com/apache/lucene-solr/pull/528
  
I like that the ZK version of processCommit completely overrides the 
superclass, since the logic is much more complicated.  But it's a little 
unfortunate to call super.processCommit, which will do some redundant 
setupRequest work.  One slight change would be for these method calls to go 
back to calling localCommit() as they did before, but also to put the 
cmd.setVersion logic at the front of localCommit's logic, since that is always 
done first.


---




[GitHub] lucene-solr issue #528: SOLR-12955 2

2018-12-26 Thread dsmiley
Github user dsmiley commented on the issue:

https://github.com/apache/lucene-solr/pull/528
  
* DUP.next can be removed as it's redundant with the same field in the superclass.
* DUP.MAX_RETRIES_TO_FOLLOWERS_DEFAULT: you accidentally added an 'R' to the 
** of the javadoc.
* DUP.isLeader ought to go after the constructor.  Generally, the 
constructor goes before instance methods.
* DUP.isIndexChanged(): remove this method; it's strictly internal, and it's 
inconsistent with the fact that many fields don't have accessors, so why 
should this one?
* DUP.versionAdd shouldClone: I think it would be a little simpler to have 
the second conditional check clonedDoc != null instead of "shouldClone".  At 
that point you would have only one reference to shouldClone and could inline 
it into its initialization.
* DUP.postProcessAdd: I think this method would be better named doDistribAdd.
* Maybe we can avoid DistributedZkUpdateProcessor overriding processAdd.  
Somehow checkReplicationTracker() needs to get called.  One way to do this 
would be for setupRequest to check whether the updateCommand is an instance of 
AddUpdateCommand.
* It'd be excellent if we could get some consistency in calling 
setupRequest -- all processXXX methods should first call setupRequest, or an 
overloaded variant if needed.


---




[GitHub] lucene-solr issue #528: SOLR-12955 2

2018-12-26 Thread dsmiley
Github user dsmiley commented on the issue:

https://github.com/apache/lucene-solr/pull/528
  
* DUP.postProcessAdd's javadoc is confusing; maybe drop it.  It says "This 
method can be overridden to tamper with the cmd", but I'm looking at where we 
override it and I don't see any tampering with this.updateCommand.
* DUP: Let's remove the Standalone subclass in favor of having the base 
class do the standalone functionality; it's not really doing much of 
anything other than relying on the base impl.  There's practically no 
functionality (real code) in the standalone class right now, so it's kind of 
pointless.
* DistributedZkUpdateProcessor's constructor that takes a docMerger isn't 
called externally, so let's have just one constructor.


---




[GitHub] lucene-solr pull request #528: SOLR-12955 2

2018-12-26 Thread dsmiley
Github user dsmiley commented on a diff in the pull request:

https://github.com/apache/lucene-solr/pull/528#discussion_r243999855
  
--- Diff: 
solr/core/src/java/org/apache/solr/update/processor/DistributedStandaloneUpdateProcessor.java
 ---
@@ -0,0 +1,121 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.solr.update.processor;
+
+import java.io.IOException;
+import java.lang.invoke.MethodHandles;
+import java.util.HashSet;
+import java.util.Set;
+import java.util.concurrent.CompletionService;
+import java.util.concurrent.ExecutorCompletionService;
+import java.util.concurrent.Future;
+
+import org.apache.solr.common.cloud.Replica;
+import org.apache.solr.request.SolrQueryRequest;
+import org.apache.solr.response.SolrQueryResponse;
+import org.apache.solr.update.AddUpdateCommand;
+import org.apache.solr.update.CommitUpdateCommand;
+import org.apache.solr.update.DeleteUpdateCommand;
+import org.apache.solr.update.UpdateCommand;
+import org.apache.solr.util.TestInjection;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+public class DistributedStandaloneUpdateProcessor extends 
DistributedUpdateProcessor {
+
+  private static final Logger log = 
LoggerFactory.getLogger(MethodHandles.lookup().lookupClass());
+
+  public DistributedStandaloneUpdateProcessor(SolrQueryRequest req,
+  SolrQueryResponse rsp, 
UpdateRequestProcessor next) {
+super(req, rsp, next);
+  }
+
+  public DistributedStandaloneUpdateProcessor(SolrQueryRequest req,
+  SolrQueryResponse rsp, 
AtomicUpdateDocumentMerger docMerger,
+  UpdateRequestProcessor next) 
{
+super(req, rsp, docMerger, next);
+  }
+
+  @Override
+  Replica.Type computeReplicaType() {
+return Replica.Type.NRT;
+  }
+
+  @Override
+  public void processCommit(CommitUpdateCommand cmd) throws IOException {
+
+assert TestInjection.injectFailUpdateRequests();
+
+updateCommand = cmd;
+
+CompletionService completionService = new 
ExecutorCompletionService<>(req.getCore().getCoreContainer().getUpdateShardHandler().getUpdateExecutor());
+Set<Future> pending = new HashSet<>();
+if (replicaType == Replica.Type.TLOG) {
+  if (isLeader) {
+super.processCommit(cmd);
+  }
+} else if (replicaType == Replica.Type.PULL) {
+  log.warn("Commit not supported on replicas of type " + 
Replica.Type.PULL);
+} else {
+  // NRT replicas will always commit
+  super.processCommit(cmd);
+}
+  }
+
+  @Override
+  public void processAdd(AddUpdateCommand cmd) throws IOException {
+assert TestInjection.injectFailUpdateRequests();
+
+updateCommand = cmd;
+isLeader = getNonZkLeaderAssumption(req);
+
+super.processAdd(cmd);
+  }
+
+  @Override
+  protected void doDeleteById(DeleteUpdateCommand cmd) throws IOException {
+isLeader = getNonZkLeaderAssumption(req);
+super.doDeleteById(cmd);
+  }
+
+  @Override
+  protected void doDeleteByQuery(DeleteUpdateCommand cmd) throws 
IOException {
+// even in non zk mode, tests simulate updates from a leader
+isLeader = getNonZkLeaderAssumption(req);
+super.doDeleteByQuery(cmd, null, null);
+  }
+
+  @Override
+  public void finish() throws IOException {
--- End diff --

Never mind; just leave this be then.  Let's keep our changes to the more 
clear/obvious ones.


---




[GitHub] lucene-solr pull request #528: SOLR-12955 2

2018-12-24 Thread dsmiley
Github user dsmiley commented on a diff in the pull request:

https://github.com/apache/lucene-solr/pull/528#discussion_r243847609
  
--- Diff: 
solr/core/src/java/org/apache/solr/update/processor/DistributedUpdateProcessor.java
 ---
@@ -1250,23 +1250,10 @@ private UpdateCommand 
fetchFullUpdateFromLeader(AddUpdateCommand inplaceAdd, lon
 params.set("onlyIfActive", true);
 SolrRequest ur = new 
GenericSolrRequest(METHOD.GET, "/get", params);
 
-String leaderUrl = req.getParams().get(DISTRIB_FROM);
-
-if (leaderUrl == null) {
-  // An update we're dependent upon didn't arrive! This is unexpected. 
Perhaps likely our leader is
-  // down or partitioned from us for some reason. Lets force refresh 
cluster state, and request the
-  // leader for the update.
-  if (zkController == null) { // we should be in cloud mode, but wtf? 
could be a unit test
-throw new SolrException(ErrorCode.SERVER_ERROR, "Can't find 
document with id=" + id + ", but fetching from leader "
-+ "failed since we're not in cloud mode.");
-  }
-  Replica leader;
-  try {
-leader = 
zkController.getZkStateReader().getLeaderRetry(collection, 
cloudDesc.getShardId());
-  } catch (InterruptedException e) {
-throw new SolrException(ErrorCode.SERVER_ERROR, "Exception during 
fetching from leader.", e);
-  }
-  leaderUrl = leader.getCoreUrl();
+String leaderUrl = getLeaderUrl(id);
--- End diff --

I looked again and it seems fine.  It could be that the diff I was looking at 
confused the matter.


---




[GitHub] lucene-solr pull request #528: SOLR-12955 2

2018-12-24 Thread dsmiley
Github user dsmiley commented on a diff in the pull request:

https://github.com/apache/lucene-solr/pull/528#discussion_r243847326
  
--- Diff: 
solr/core/src/java/org/apache/solr/update/processor/DistributedStandaloneUpdateProcessor.java
 ---
@@ -0,0 +1,121 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.solr.update.processor;
+
+import java.io.IOException;
+import java.lang.invoke.MethodHandles;
+import java.util.HashSet;
+import java.util.Set;
+import java.util.concurrent.CompletionService;
+import java.util.concurrent.ExecutorCompletionService;
+import java.util.concurrent.Future;
+
+import org.apache.solr.common.cloud.Replica;
+import org.apache.solr.request.SolrQueryRequest;
+import org.apache.solr.response.SolrQueryResponse;
+import org.apache.solr.update.AddUpdateCommand;
+import org.apache.solr.update.CommitUpdateCommand;
+import org.apache.solr.update.DeleteUpdateCommand;
+import org.apache.solr.update.UpdateCommand;
+import org.apache.solr.util.TestInjection;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+public class DistributedStandaloneUpdateProcessor extends 
DistributedUpdateProcessor {
+
+  private static final Logger log = 
LoggerFactory.getLogger(MethodHandles.lookup().lookupClass());
+
+  public DistributedStandaloneUpdateProcessor(SolrQueryRequest req,
+  SolrQueryResponse rsp, 
UpdateRequestProcessor next) {
+super(req, rsp, next);
+  }
+
+  public DistributedStandaloneUpdateProcessor(SolrQueryRequest req,
+  SolrQueryResponse rsp, 
AtomicUpdateDocumentMerger docMerger,
+  UpdateRequestProcessor next) 
{
+super(req, rsp, docMerger, next);
+  }
+
+  @Override
+  Replica.Type computeReplicaType() {
+return Replica.Type.NRT;
+  }
+
+  @Override
+  public void processCommit(CommitUpdateCommand cmd) throws IOException {
+
+assert TestInjection.injectFailUpdateRequests();
+
+updateCommand = cmd;
+
+CompletionService completionService = new 
ExecutorCompletionService<>(req.getCore().getCoreContainer().getUpdateShardHandler().getUpdateExecutor());
+Set<Future> pending = new HashSet<>();
+if (replicaType == Replica.Type.TLOG) {
+  if (isLeader) {
+super.processCommit(cmd);
+  }
+} else if (replicaType == Replica.Type.PULL) {
+  log.warn("Commit not supported on replicas of type " + 
Replica.Type.PULL);
+} else {
+  // NRT replicas will always commit
+  super.processCommit(cmd);
+}
+  }
+
+  @Override
+  public void processAdd(AddUpdateCommand cmd) throws IOException {
+assert TestInjection.injectFailUpdateRequests();
+
+updateCommand = cmd;
+isLeader = getNonZkLeaderAssumption(req);
+
+super.processAdd(cmd);
+  }
+
+  @Override
+  protected void doDeleteById(DeleteUpdateCommand cmd) throws IOException {
+isLeader = getNonZkLeaderAssumption(req);
+super.doDeleteById(cmd);
+  }
+
+  @Override
+  protected void doDeleteByQuery(DeleteUpdateCommand cmd) throws 
IOException {
+// even in non zk mode, tests simulate updates from a leader
+isLeader = getNonZkLeaderAssumption(req);
+super.doDeleteByQuery(cmd, null, null);
+  }
+
+  @Override
+  public void finish() throws IOException {
--- End diff --

Isn't this what the base finish() will do?  So why override it?  If there really 
is a reason, then a comment here would help.


---




[GitHub] lucene-solr pull request #528: SOLR-12955 2

2018-12-24 Thread dsmiley
Github user dsmiley commented on a diff in the pull request:

https://github.com/apache/lucene-solr/pull/528#discussion_r243847270
  
--- Diff: 
solr/core/src/java/org/apache/solr/update/processor/DistributedStandaloneUpdateProcessor.java
 ---
@@ -0,0 +1,121 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.solr.update.processor;
+
+import java.io.IOException;
+import java.lang.invoke.MethodHandles;
+import java.util.HashSet;
+import java.util.Set;
+import java.util.concurrent.CompletionService;
+import java.util.concurrent.ExecutorCompletionService;
+import java.util.concurrent.Future;
+
+import org.apache.solr.common.cloud.Replica;
+import org.apache.solr.request.SolrQueryRequest;
+import org.apache.solr.response.SolrQueryResponse;
+import org.apache.solr.update.AddUpdateCommand;
+import org.apache.solr.update.CommitUpdateCommand;
+import org.apache.solr.update.DeleteUpdateCommand;
+import org.apache.solr.update.UpdateCommand;
+import org.apache.solr.util.TestInjection;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+public class DistributedStandaloneUpdateProcessor extends 
DistributedUpdateProcessor {
+
+  private static final Logger log = 
LoggerFactory.getLogger(MethodHandles.lookup().lookupClass());
+
+  public DistributedStandaloneUpdateProcessor(SolrQueryRequest req,
+  SolrQueryResponse rsp, 
UpdateRequestProcessor next) {
+super(req, rsp, next);
+  }
+
+  public DistributedStandaloneUpdateProcessor(SolrQueryRequest req,
+  SolrQueryResponse rsp, 
AtomicUpdateDocumentMerger docMerger,
+  UpdateRequestProcessor next) 
{
+super(req, rsp, docMerger, next);
+  }
+
+  @Override
+  Replica.Type computeReplicaType() {
+return Replica.Type.NRT;
+  }
+
+  @Override
+  public void processCommit(CommitUpdateCommand cmd) throws IOException {
+
+assert TestInjection.injectFailUpdateRequests();
+
+updateCommand = cmd;
+
+CompletionService completionService = new 
ExecutorCompletionService<>(req.getCore().getCoreContainer().getUpdateShardHandler().getUpdateExecutor());
+Set<Future> pending = new HashSet<>();
+if (replicaType == Replica.Type.TLOG) {
+  if (isLeader) {
+super.processCommit(cmd);
+  }
+} else if (replicaType == Replica.Type.PULL) {
+  log.warn("Commit not supported on replicas of type " + 
Replica.Type.PULL);
+} else {
+  // NRT replicas will always commit
+  super.processCommit(cmd);
+}
+  }
+
+  @Override
+  public void processAdd(AddUpdateCommand cmd) throws IOException {
+assert TestInjection.injectFailUpdateRequests();
+
+updateCommand = cmd;
+isLeader = getNonZkLeaderAssumption(req);
+
+super.processAdd(cmd);
+  }
+
+  @Override
+  protected void doDeleteById(DeleteUpdateCommand cmd) throws IOException {
+isLeader = getNonZkLeaderAssumption(req);
+super.doDeleteById(cmd);
+  }
+
+  @Override
+  protected void doDeleteByQuery(DeleteUpdateCommand cmd) throws 
IOException {
+// even in non zk mode, tests simulate updates from a leader
+isLeader = getNonZkLeaderAssumption(req);
--- End diff --

I think we should just call setupRequest instead of getNonZkLeaderAssumption; 
setupRequest calls that and sets isLeader.  This comment applies to all/most of 
the methods here.
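
Something like this, inside DistributedStandaloneUpdateProcessor (just a sketch; 
the exact setupRequest signature is whatever we settle on):

```java
@Override
protected void doDeleteByQuery(DeleteUpdateCommand cmd) throws IOException {
  setupRequest(cmd);  // records the command and sets isLeader (via getNonZkLeaderAssumption here)
  super.doDeleteByQuery(cmd, null, null);
}
```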


---




[GitHub] lucene-solr pull request #528: SOLR-12955 2

2018-12-24 Thread dsmiley
Github user dsmiley commented on a diff in the pull request:

https://github.com/apache/lucene-solr/pull/528#discussion_r243844185
  
--- Diff: 
solr/core/src/java/org/apache/solr/update/processor/DistributedUpdateProcessor.java
 ---
@@ -1250,23 +1250,10 @@ private UpdateCommand 
fetchFullUpdateFromLeader(AddUpdateCommand inplaceAdd, lon
 params.set("onlyIfActive", true);
 SolrRequest ur = new 
GenericSolrRequest(METHOD.GET, "/get", params);
 
-String leaderUrl = req.getParams().get(DISTRIB_FROM);
-
-if (leaderUrl == null) {
-  // An update we're dependent upon didn't arrive! This is unexpected. 
Perhaps likely our leader is
-  // down or partitioned from us for some reason. Lets force refresh 
cluster state, and request the
-  // leader for the update.
-  if (zkController == null) { // we should be in cloud mode, but wtf? 
could be a unit test
-throw new SolrException(ErrorCode.SERVER_ERROR, "Can't find 
document with id=" + id + ", but fetching from leader "
-+ "failed since we're not in cloud mode.");
-  }
-  Replica leader;
-  try {
-leader = 
zkController.getZkStateReader().getLeaderRetry(collection, 
cloudDesc.getShardId());
-  } catch (InterruptedException e) {
-throw new SolrException(ErrorCode.SERVER_ERROR, "Exception during 
fetching from leader.", e);
-  }
-  leaderUrl = leader.getCoreUrl();
+String leaderUrl = getLeaderUrl(id);
--- End diff --

I'm not sure this diff is purely a refactoring (i.e. keeps semantics identical).  
For example, the exception message isn't the same.


---




[GitHub] lucene-solr pull request #528: SOLR-12955 2

2018-12-24 Thread dsmiley
Github user dsmiley commented on a diff in the pull request:

https://github.com/apache/lucene-solr/pull/528#discussion_r243843542
  
--- Diff: 
solr/core/src/java/org/apache/solr/update/processor/DistributedStandaloneUpdateProcessor.java
 ---
@@ -42,15 +53,36 @@ public 
DistributedStandaloneUpdateProcessor(SolrQueryRequest req,
   }
 
   @Override
-  String getCollectionName(CloudDescriptor cloudDesc) {
+  String computeCollectionName(CloudDescriptor cloudDesc) {
 return null;
   }
 
   @Override
-  Replica.Type getReplicaType(CloudDescriptor cloudDesc) {
+  Replica.Type computeReplicaType(CloudDescriptor cloudDesc) {
 return Replica.Type.NRT;
   }
 
+  @Override
+  public void processCommit(CommitUpdateCommand cmd) throws IOException {
+
+assert TestInjection.injectFailUpdateRequests();
+
+updateCommand = cmd;
+
+CompletionService completionService = new 
ExecutorCompletionService<>(req.getCore().getCoreContainer().getUpdateShardHandler().getUpdateExecutor());
--- End diff --

Why is all this stuff here in standalone mode?


---




[GitHub] lucene-solr issue #528: SOLR-12955 2

2018-12-17 Thread dsmiley
Github user dsmiley commented on the issue:

https://github.com/apache/lucene-solr/pull/528
  
Looks good.  More to do?


---




[GitHub] lucene-solr pull request #528: SOLR-12955 2

2018-12-15 Thread dsmiley
Github user dsmiley commented on a diff in the pull request:

https://github.com/apache/lucene-solr/pull/528#discussion_r241967674
  
--- Diff: 
solr/core/src/java/org/apache/solr/update/processor/DistributedUpdateProcessor.java
 ---
@@ -1417,45 +1407,7 @@ private void doDeleteById(DeleteUpdateCommand cmd) 
throws IOException {
   return;
 }
 
-if (zkEnabled && isLeader && !isSubShardLeader)  {
-  DocCollection coll = 
zkController.getClusterState().getCollection(collection);
-  List subShardLeaders = getSubShardLeaders(coll, 
cloudDesc.getShardId(), cmd.getId(), null);
-  // the list will actually have only one element for an add 
request
-  if (subShardLeaders != null && !subShardLeaders.isEmpty()) {
-ModifiableSolrParams params = new 
ModifiableSolrParams(filterParams(req.getParams()));
-params.set(DISTRIB_UPDATE_PARAM, 
DistribPhase.FROMLEADER.toString());
-params.set(DISTRIB_FROM, ZkCoreNodeProps.getCoreUrl(
-zkController.getBaseUrl(), req.getCore().getName()));
-params.set(DISTRIB_FROM_PARENT, cloudDesc.getShardId());
-cmdDistrib.distribDelete(cmd, subShardLeaders, params, true, null, 
null);
-  }
-
-  final List nodesByRoutingRules = 
getNodesByRoutingRules(zkController.getClusterState(), coll, cmd.getId(), null);
-  if (nodesByRoutingRules != null && !nodesByRoutingRules.isEmpty())  {
-ModifiableSolrParams params = new 
ModifiableSolrParams(filterParams(req.getParams()));
-params.set(DISTRIB_UPDATE_PARAM, 
DistribPhase.FROMLEADER.toString());
-params.set(DISTRIB_FROM, ZkCoreNodeProps.getCoreUrl(
-zkController.getBaseUrl(), req.getCore().getName()));
-params.set(DISTRIB_FROM_COLLECTION, collection);
-params.set(DISTRIB_FROM_SHARD, cloudDesc.getShardId());
-cmdDistrib.distribDelete(cmd, nodesByRoutingRules, params, true, 
null, null);
-  }
-}
-
-if (nodes != null) {
-  ModifiableSolrParams params = new 
ModifiableSolrParams(filterParams(req.getParams()));
-  params.set(DISTRIB_UPDATE_PARAM,
-  (isLeader || isSubShardLeader ? 
DistribPhase.FROMLEADER.toString()
-  : DistribPhase.TOLEADER.toString()));
-  params.set(DISTRIB_FROM, ZkCoreNodeProps.getCoreUrl(
-  zkController.getBaseUrl(), req.getCore().getName()));
-
-  if (req.getParams().get(UpdateRequest.MIN_REPFACT) != null) {
-// TODO: Kept for rolling upgrades only. Remove in Solr 9
-params.add(UpdateRequest.MIN_REPFACT, 
req.getParams().get(UpdateRequest.MIN_REPFACT));
-  }
-  cmdDistrib.distribDelete(cmd, nodes, params, false, 
rollupReplicationTracker, leaderReplicationTracker);
-}
+postProcessDeleteById(cmd);
--- End diff --

I can understand why you took the current approach.  Let's put this off 
until near the end to see whether it's straightforward or not.  Maybe it won't be, 
and then we shouldn't do it.


---




[GitHub] lucene-solr pull request #528: SOLR-12955 2

2018-12-15 Thread dsmiley
Github user dsmiley commented on a diff in the pull request:

https://github.com/apache/lucene-solr/pull/528#discussion_r241967637
  
--- Diff: 
solr/core/src/java/org/apache/solr/update/processor/DistributedStandaloneUpdateProcessor.java
 ---
@@ -56,8 +59,15 @@ public void doDeleteByQuery(DeleteUpdateCommand cmd) 
throws IOException {
 super.doDeleteByQuery(cmd, Collections.emptyList(), null);
   }
 
+  @Override
+  protected void postProcessDeleteByQuery(DeleteUpdateCommand cmd,
+  List 
replicas, DocCollection coll) throws IOException {
+// no op
+  }
+
   @Override
   boolean isLeader(UpdateCommand cmd) {
+updateCommand = cmd;
--- End diff --

+1 to the setupRequest name.  I don't think it needs to return "isLeader"; it 
should set that field and return void.
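
Roughly this shape, in the standalone processor (a sketch, not the PR's actual code):

```java
void setupRequest(UpdateCommand cmd) {
  this.updateCommand = cmd;
  // standalone assumption about leadership; the ZK subclass overrides this with real cluster state
  this.isLeader = getNonZkLeaderAssumption(req);
}
```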


---




[GitHub] lucene-solr pull request #528: SOLR-12955 2

2018-12-15 Thread dsmiley
Github user dsmiley commented on a diff in the pull request:

https://github.com/apache/lucene-solr/pull/528#discussion_r241945926
  
--- Diff: 
solr/core/src/java/org/apache/solr/update/processor/DistributedUpdateProcessor.java
 ---
@@ -1417,45 +1407,7 @@ private void doDeleteById(DeleteUpdateCommand cmd) 
throws IOException {
   return;
 }
 
-if (zkEnabled && isLeader && !isSubShardLeader)  {
-  DocCollection coll = 
zkController.getClusterState().getCollection(collection);
-  List subShardLeaders = getSubShardLeaders(coll, 
cloudDesc.getShardId(), cmd.getId(), null);
-  // the list will actually have only one element for an add 
request
-  if (subShardLeaders != null && !subShardLeaders.isEmpty()) {
-ModifiableSolrParams params = new 
ModifiableSolrParams(filterParams(req.getParams()));
-params.set(DISTRIB_UPDATE_PARAM, 
DistribPhase.FROMLEADER.toString());
-params.set(DISTRIB_FROM, ZkCoreNodeProps.getCoreUrl(
-zkController.getBaseUrl(), req.getCore().getName()));
-params.set(DISTRIB_FROM_PARENT, cloudDesc.getShardId());
-cmdDistrib.distribDelete(cmd, subShardLeaders, params, true, null, 
null);
-  }
-
-  final List nodesByRoutingRules = 
getNodesByRoutingRules(zkController.getClusterState(), coll, cmd.getId(), null);
-  if (nodesByRoutingRules != null && !nodesByRoutingRules.isEmpty())  {
-ModifiableSolrParams params = new 
ModifiableSolrParams(filterParams(req.getParams()));
-params.set(DISTRIB_UPDATE_PARAM, 
DistribPhase.FROMLEADER.toString());
-params.set(DISTRIB_FROM, ZkCoreNodeProps.getCoreUrl(
-zkController.getBaseUrl(), req.getCore().getName()));
-params.set(DISTRIB_FROM_COLLECTION, collection);
-params.set(DISTRIB_FROM_SHARD, cloudDesc.getShardId());
-cmdDistrib.distribDelete(cmd, nodesByRoutingRules, params, true, 
null, null);
-  }
-}
-
-if (nodes != null) {
-  ModifiableSolrParams params = new 
ModifiableSolrParams(filterParams(req.getParams()));
-  params.set(DISTRIB_UPDATE_PARAM,
-  (isLeader || isSubShardLeader ? 
DistribPhase.FROMLEADER.toString()
-  : DistribPhase.TOLEADER.toString()));
-  params.set(DISTRIB_FROM, ZkCoreNodeProps.getCoreUrl(
-  zkController.getBaseUrl(), req.getCore().getName()));
-
-  if (req.getParams().get(UpdateRequest.MIN_REPFACT) != null) {
-// TODO: Kept for rolling upgrades only. Remove in Solr 9
-params.add(UpdateRequest.MIN_REPFACT, 
req.getParams().get(UpdateRequest.MIN_REPFACT));
-  }
-  cmdDistrib.distribDelete(cmd, nodes, params, false, 
rollupReplicationTracker, leaderReplicationTracker);
-}
+postProcessDeleteById(cmd);
--- End diff --

I wonder if we actually need this postProcessDeleteById(cmd) method... 
could the subclass put its logic after calling super.doDeleteById()?


---




[GitHub] lucene-solr pull request #528: SOLR-12955 2

2018-12-15 Thread dsmiley
Github user dsmiley commented on a diff in the pull request:

https://github.com/apache/lucene-solr/pull/528#discussion_r241945549
  
--- Diff: 
solr/core/src/java/org/apache/solr/update/processor/DistributedZkUpdateProcessor.java
 ---
@@ -223,6 +289,7 @@ private String getLeaderUrlZk(String id) {
 
   @Override
   boolean isLeader(UpdateCommand cmd) {
+updateCommand = cmd;
--- End diff --

again; this line makes me frown


---




[GitHub] lucene-solr pull request #528: SOLR-12955 2

2018-12-15 Thread dsmiley
Github user dsmiley commented on a diff in the pull request:

https://github.com/apache/lucene-solr/pull/528#discussion_r241945502
  
--- Diff: 
solr/core/src/java/org/apache/solr/update/processor/DistributedStandaloneUpdateProcessor.java
 ---
@@ -56,8 +59,15 @@ public void doDeleteByQuery(DeleteUpdateCommand cmd) 
throws IOException {
 super.doDeleteByQuery(cmd, Collections.emptyList(), null);
   }
 
+  @Override
+  protected void postProcessDeleteByQuery(DeleteUpdateCommand cmd,
+  List 
replicas, DocCollection coll) throws IOException {
+// no op
+  }
+
   @Override
   boolean isLeader(UpdateCommand cmd) {
+updateCommand = cmd;
--- End diff --

Ouch, this piece is ugly.  A method named "is..." (or "get...") should not 
have side effects.


---




[GitHub] lucene-solr pull request #528: SOLR-12955 2

2018-12-15 Thread dsmiley
Github user dsmiley commented on a diff in the pull request:

https://github.com/apache/lucene-solr/pull/528#discussion_r241945460
  
--- Diff: 
solr/core/src/java/org/apache/solr/update/processor/DistributedZkUpdateProcessor.java
 ---
@@ -58,6 +75,127 @@ String getCollectionName(CloudDescriptor cloudDesc) {
 return cloudDesc.getReplicaType();
   }
 
+  @Override
+  public void doDeleteByQuery(DeleteUpdateCommand cmd) throws IOException {
+// even in non zk mode, tests simulate updates from a leader
--- End diff --

this particular comment is I think intended for the standalone mode


---




[GitHub] lucene-solr issue #519: WIP: SOLR-12955: rebase and split changes to smaller...

2018-12-10 Thread dsmiley
Github user dsmiley commented on the issue:

https://github.com/apache/lucene-solr/pull/519
  
I'm not sure if the "base" functionality needed to be in a new class... I 
could imagine simply adding subclasses and leave DUP named as it is with it's 
base functionality.  WDYT?  It may help keep git history better too.
p.s. I'm GMT+1 this week


---




[GitHub] lucene-solr issue #519: WIP: SOLR-12955: rebase and split changes to smaller...

2018-12-09 Thread dsmiley
Github user dsmiley commented on the issue:

https://github.com/apache/lucene-solr/pull/519
  
good start


---




[GitHub] lucene-solr issue #489: SOLR-12955: separate DistribUpdateProcessor to Zk an...

2018-12-09 Thread dsmiley
Github user dsmiley commented on the issue:

https://github.com/apache/lucene-solr/pull/489
  
There is no "development" branch, so I don't know what you mean by that.  
You don't _have_ to run the full test suite after each commit, but _ideally_ it 
ought to pass on each commit if it were to be run.  I don't think I will do it. 
 It's up to you to pause on each commit for my input but I don't think 
necessary since I'm overall happy with the final result, I just want to see the 
incremental steps to observe where the code is going at each juncture.  When I 
finally commit something, I'm not sure yet if I'm going to do it in one commit 
or do it separately... but don't let that stop you from breaking it into as 
small steps as you please.  Please use commit comments to point out anything 
non-obvious.  The purpose here is mostly to show what went where since a single 
all-in-one change is kinda unreviewable (or at least would require committing 
in blind faith).


---




[GitHub] lucene-solr pull request #489: SOLR-12955: separate DistribUpdateProcessor t...

2018-12-03 Thread dsmiley
Github user dsmiley commented on a diff in the pull request:

https://github.com/apache/lucene-solr/pull/489#discussion_r238527873
  
--- Diff: 
solr/core/src/java/org/apache/solr/update/processor/DistributedStandAloneUpdateProcessor.java
 ---
@@ -0,0 +1,100 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.solr.update.processor;
+
+import java.io.IOException;
+
+import org.apache.solr.common.cloud.Replica;
+import org.apache.solr.core.CoreContainer;
+import org.apache.solr.request.SolrQueryRequest;
+import org.apache.solr.response.SolrQueryResponse;
+import org.apache.solr.update.AddUpdateCommand;
+import org.apache.solr.update.DeleteUpdateCommand;
+import org.apache.solr.update.UpdateCommand;
+
+import static 
org.apache.solr.update.processor.DistributingUpdateProcessorFactory.DISTRIB_UPDATE_PARAM;
+
+public class DistributedStandAloneUpdateProcessor extends 
DistributedUpdateProcessor {
--- End diff --

preferably lowercase the 'A' thus "DistributedStandaloneUpdateProcessor"


---




[GitHub] lucene-solr pull request #489: SOLR-12955: separate DistribUpdateProcessor t...

2018-12-03 Thread dsmiley
Github user dsmiley commented on a diff in the pull request:

https://github.com/apache/lucene-solr/pull/489#discussion_r238528464
  
--- Diff: 
solr/core/src/java/org/apache/solr/update/processor/CdcrUpdateProcessor.java ---
@@ -16,115 +16,12 @@
  */
 package org.apache.solr.update.processor;
 
-import java.io.IOException;
-import java.lang.invoke.MethodHandles;
-
-import org.apache.solr.common.params.CommonParams;
-import org.apache.solr.common.params.ModifiableSolrParams;
-import org.apache.solr.common.params.SolrParams;
-import org.apache.solr.request.SolrQueryRequest;
-import org.apache.solr.response.SolrQueryResponse;
-import org.apache.solr.update.AddUpdateCommand;
-import org.apache.solr.update.DeleteUpdateCommand;
-import org.apache.solr.update.UpdateCommand;
-import org.slf4j.Logger;
-import org.slf4j.LoggerFactory;
-
 /**
  * 
- * Extends {@link 
org.apache.solr.update.processor.DistributedUpdateProcessor} to force peer sync 
logic
- * for every updates. This ensures that the version parameter sent by the 
source cluster is kept
- * by the target cluster.
+ * An Interface for Shared params for Cdcr Update Processors
--- End diff --

Do we really need this interface just for this param?  I think not.


---




[GitHub] lucene-solr pull request #489: SOLR-12955: separate DistribUpdateProcessor t...

2018-12-03 Thread dsmiley
Github user dsmiley commented on a diff in the pull request:

https://github.com/apache/lucene-solr/pull/489#discussion_r238527608
  
--- Diff: 
solr/core/src/java/org/apache/solr/update/processor/DistributedUpdateProcessorFactory.java
 ---
@@ -50,8 +51,16 @@ public static void 
addParamToDistributedRequestWhitelist(final SolrQueryRequest
   public UpdateRequestProcessor getInstance(SolrQueryRequest req,
   SolrQueryResponse rsp, UpdateRequestProcessor next) {
 // note: will sometimes return DURP (no overhead) instead of wrapping
--- End diff --

This comment is now misplaced.  Before, it was immediately prior to 
TimeRoutedAliasUpdateProcessor.wrap, and I think it should be kept that way.


---




[GitHub] lucene-solr issue #455: SOLR-12638

2018-11-28 Thread dsmiley
Github user dsmiley commented on the issue:

https://github.com/apache/lucene-solr/pull/455
  
What does "so _root_ is not hard-coded into childless documents" mean?  If 
you think SOLR-5211 is missing something, can you please post a patch file to 
the issue and explain what you're adding?


---




[GitHub] lucene-solr pull request #501: SOLR-5211

2018-11-20 Thread dsmiley
Github user dsmiley commented on a diff in the pull request:

https://github.com/apache/lucene-solr/pull/501#discussion_r235022057
  
--- Diff: solr/core/src/test/org/apache/solr/BlockUpdateTest.java ---
@@ -0,0 +1,125 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.solr;
+
+import java.util.List;
+
+import org.apache.solr.SolrJettyTestBase;
+import org.apache.solr.client.solrj.SolrClient;
+import org.apache.solr.client.solrj.SolrQuery;
+import org.apache.solr.common.SolrDocument;
+import org.apache.solr.common.SolrDocumentList;
+import org.apache.solr.common.SolrException;
+import org.apache.solr.common.SolrInputDocument;
+import org.apache.solr.common.params.CommonParams;
+import org.junit.BeforeClass;
+import org.junit.Rule;
+import org.junit.Test;
+import org.junit.rules.ExpectedException;
+
+import static org.hamcrest.CoreMatchers.is;
+
+public class BlockUpdateTest extends SolrJettyTestBase {
--- End diff --

I think it tests with / without the root field then, so its name could reflect 
_that_, e.g. RootFieldTest.


---




[GitHub] lucene-solr pull request #501: SOLR-5211

2018-11-20 Thread dsmiley
Github user dsmiley commented on a diff in the pull request:

https://github.com/apache/lucene-solr/pull/501#discussion_r235001356
  
--- Diff: solr/core/src/test/org/apache/solr/BlockUpdateTest.java ---
@@ -0,0 +1,125 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.solr;
+
+import java.util.List;
+
+import org.apache.solr.SolrJettyTestBase;
+import org.apache.solr.client.solrj.SolrClient;
+import org.apache.solr.client.solrj.SolrQuery;
+import org.apache.solr.common.SolrDocument;
+import org.apache.solr.common.SolrDocumentList;
+import org.apache.solr.common.SolrException;
+import org.apache.solr.common.SolrInputDocument;
+import org.apache.solr.common.params.CommonParams;
+import org.junit.BeforeClass;
+import org.junit.Rule;
+import org.junit.Test;
+import org.junit.rules.ExpectedException;
+
+import static org.hamcrest.CoreMatchers.is;
+
+public class BlockUpdateTest extends SolrJettyTestBase {
--- End diff --

Here's that "Block" terminology again.  Can the tests here merge with 
another test for nested docs? (one you made?)


---




[GitHub] lucene-solr issue #501: SOLR-5211

2018-11-19 Thread dsmiley
Github user dsmiley commented on the issue:

https://github.com/apache/lucene-solr/pull/501
  
Nice.
Do we even need the test config / schema at all?  If they don't add 
anything, we shouldn't have them.  Also, I see there is still some "Block" 
terminology added here, which we can avoid in favor of "Nested".



---




[GitHub] lucene-solr pull request #501: SOLR-5211

2018-11-19 Thread dsmiley
Github user dsmiley commented on a diff in the pull request:

https://github.com/apache/lucene-solr/pull/501#discussion_r234869970
  
--- Diff: solr/core/src/java/org/apache/solr/update/AddUpdateCommand.java 
---
@@ -194,20 +197,33 @@ public String getHashableId() {
   return null; // caller should call getLuceneDocument() instead
 }
 
-String rootId = getHashableId();
-
-boolean isVersion = version != 0;
+final String rootId = getHashableId();
+final boolean hasVersion = version != 0;
+final SolrInputField versionSif = hasVersion? 
solrDoc.get(CommonParams.VERSION_FIELD): null;
--- End diff --

What I meant was that *if* the solrDoc has a \_version\_ then we can copy 
it to all nested children.  We need not examine this.version at all.  At least 
this is my understanding from reading the code; it'd be good to ensure tests pass.


---




[GitHub] lucene-solr issue #501: SOLR-5211

2018-11-19 Thread dsmiley
Github user dsmiley commented on the issue:

https://github.com/apache/lucene-solr/pull/501
  
Also please rename "block" terminology to "nested" where possible.


---




[GitHub] lucene-solr pull request #501: SOLR-5211

2018-11-19 Thread dsmiley
Github user dsmiley commented on a diff in the pull request:

https://github.com/apache/lucene-solr/pull/501#discussion_r234646277
  
--- Diff: 
solr/core/src/java/org/apache/solr/update/DirectUpdateHandler2.java ---
@@ -975,6 +976,14 @@ private void updateDocOrDocValues(AddUpdateCommand 
cmd, IndexWriter writer) thro
 }
   }
 
+  boolean forciblyAddBlockId() {
--- End diff --

Come to think of it, who cares if root is populated or not for "old" 
schemas?  Let's just do it always; no conditional test.  If someone doesn't want 
root to be populated, that same person probably doesn't have child docs and 
could just as well omit the root field from their schema.


---




[GitHub] lucene-solr pull request #501: SOLR-5211

2018-11-18 Thread dsmiley
Github user dsmiley commented on a diff in the pull request:

https://github.com/apache/lucene-solr/pull/501#discussion_r234498494
  
--- Diff: solr/core/src/java/org/apache/solr/update/AddUpdateCommand.java 
---
@@ -194,20 +203,22 @@ public String getHashableId() {
   return null; // caller should call getLuceneDocument() instead
 }
 
-String rootId = getHashableId();
-
-boolean isVersion = version != 0;
-
+final String rootId = getHashableId();
 for (SolrInputDocument sdoc : all) {
-  sdoc.setField(IndexSchema.ROOT_FIELD_NAME, rootId);
-  if(isVersion) sdoc.setField(CommonParams.VERSION_FIELD, version);
-  // TODO: if possible concurrent modification exception (if 
SolrInputDocument not cloned and is being forwarded to replicas)
-  // then we could add this field to the generated lucene document 
instead.
+  addBlockId(sdoc, rootId);
 }
 
 return () -> all.stream().map(sdoc -> DocumentBuilder.toDocument(sdoc, 
req.getSchema())).iterator();
   }
 
+  private void addBlockId(SolrInputDocument sdoc, String rootId) {
+boolean isVersion = version != 0;
+sdoc.setField(IndexSchema.ROOT_FIELD_NAME, rootId);
+if(isVersion) sdoc.setField(CommonParams.VERSION_FIELD, version);
--- End diff --

Although it may be a slight bit of scope creep, I think it would be clearer 
why this logic is needed if it merely propagated the SolrInputField from the 
root doc to the child docs.  My understanding is that this will wind up being 
equivalent to what happens now, which is very non-obvious; it took a bunch of 
digging on my part to see.


---




[GitHub] lucene-solr pull request #501: SOLR-5211

2018-11-18 Thread dsmiley
Github user dsmiley commented on a diff in the pull request:

https://github.com/apache/lucene-solr/pull/501#discussion_r234496047
  
--- Diff: solr/core/src/java/org/apache/solr/update/AddUpdateCommand.java 
---
@@ -96,12 +96,21 @@ public SolrInputDocument getSolrInputDocument() {
* Any changes made to the returned Document will not be reflected in 
the SolrInputDocument, or future calls to this
* method.
* Note that the behavior of this is sensitive to {@link 
#isInPlaceUpdate()}.
+   * @param withBlockId If true, then block id is forcibly added to the doc
*/
-   public Document getLuceneDocument() {
+   Document getLuceneDocument(boolean withBlockId) {
--- End diff --

Actually, I'm not sure we need to overload getLuceneDocument with 
this boolean after all.  If "isInPlaceUpdate()" then don't add root, but 
otherwise add it.  That'll be fine, right?


---




[GitHub] lucene-solr pull request #501: SOLR-5211

2018-11-18 Thread dsmiley
Github user dsmiley commented on a diff in the pull request:

https://github.com/apache/lucene-solr/pull/501#discussion_r234496573
  
--- Diff: 
solr/core/src/java/org/apache/solr/update/DirectUpdateHandler2.java ---
@@ -975,6 +976,14 @@ private void updateDocOrDocValues(AddUpdateCommand 
cmd, IndexWriter writer) thro
 }
   }
 
+  boolean forciblyAddBlockId() {
--- End diff --

This is here in part so that we can test _not_ doing this (i.e. the current 
behavior)... but I don't think it's worth it.  If we want the option to _not_ 
do it, it could be conditional via luceneMatchVersion, but I don't care either 
way since it's minor to add it unconditionally.


---




[GitHub] lucene-solr pull request #501: SOLR-5211

2018-11-18 Thread dsmiley
Github user dsmiley commented on a diff in the pull request:

https://github.com/apache/lucene-solr/pull/501#discussion_r234495484
  
--- Diff: solr/core/src/java/org/apache/solr/update/AddUpdateCommand.java 
---
@@ -96,12 +96,21 @@ public SolrInputDocument getSolrInputDocument() {
* Any changes made to the returned Document will not be reflected in 
the SolrInputDocument, or future calls to this
* method.
* Note that the behavior of this is sensitive to {@link 
#isInPlaceUpdate()}.
+   * @param withBlockId If true, then block id is forcibly added to the doc
*/
-   public Document getLuceneDocument() {
+   Document getLuceneDocument(boolean withBlockId) {
--- End diff --

can we call this addRootField ?


---




[GitHub] lucene-solr pull request #501: SOLR-5211

2018-11-18 Thread dsmiley
Github user dsmiley commented on a diff in the pull request:

https://github.com/apache/lucene-solr/pull/501#discussion_r234496324
  
--- Diff: solr/core/src/java/org/apache/solr/update/AddUpdateCommand.java 
---
@@ -194,20 +203,22 @@ public String getHashableId() {
   return null; // caller should call getLuceneDocument() instead
 }
 
-String rootId = getHashableId();
-
-boolean isVersion = version != 0;
-
+final String rootId = getHashableId();
 for (SolrInputDocument sdoc : all) {
-  sdoc.setField(IndexSchema.ROOT_FIELD_NAME, rootId);
-  if(isVersion) sdoc.setField(CommonParams.VERSION_FIELD, version);
-  // TODO: if possible concurrent modification exception (if 
SolrInputDocument not cloned and is being forwarded to replicas)
-  // then we could add this field to the generated lucene document 
instead.
+  addBlockId(sdoc, rootId);
 }
 
 return () -> all.stream().map(sdoc -> DocumentBuilder.toDocument(sdoc, 
req.getSchema())).iterator();
   }
 
+  private void addBlockId(SolrInputDocument sdoc, String rootId) {
--- End diff --

Can we rename this to addRootField, and separate out the version part into 
addVersionField, which should _not_ be called by this method (the caller 
should call it)?  addVersionField only needs to be called for the nested case, 
not by getLuceneDocument.
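
Roughly (a sketch, reusing the constants from the diff above; where the caller 
invokes addVersionField is left open as I said):

```java
private void addRootField(SolrInputDocument sdoc, String rootId) {
  sdoc.setField(IndexSchema.ROOT_FIELD_NAME, rootId);
}

private void addVersionField(SolrInputDocument sdoc) {
  if (version != 0) {
    sdoc.setField(CommonParams.VERSION_FIELD, version);  // only the nested-docs path needs this
  }
}
```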


---




[GitHub] lucene-solr pull request #501: SOLR-5211

2018-11-18 Thread dsmiley
Github user dsmiley commented on a diff in the pull request:

https://github.com/apache/lucene-solr/pull/501#discussion_r234496354
  
--- Diff: solr/core/src/java/org/apache/solr/update/AddUpdateCommand.java 
---
@@ -194,20 +203,22 @@ public String getHashableId() {
   return null; // caller should call getLuceneDocument() instead
 }
 
-String rootId = getHashableId();
-
-boolean isVersion = version != 0;
-
+final String rootId = getHashableId();
 for (SolrInputDocument sdoc : all) {
-  sdoc.setField(IndexSchema.ROOT_FIELD_NAME, rootId);
-  if(isVersion) sdoc.setField(CommonParams.VERSION_FIELD, version);
-  // TODO: if possible concurrent modification exception (if 
SolrInputDocument not cloned and is being forwarded to replicas)
-  // then we could add this field to the generated lucene document 
instead.
+  addBlockId(sdoc, rootId);
 }
 
 return () -> all.stream().map(sdoc -> DocumentBuilder.toDocument(sdoc, 
req.getSchema())).iterator();
   }
 
+  private void addBlockId(SolrInputDocument sdoc, String rootId) {
+boolean isVersion = version != 0;
+sdoc.setField(IndexSchema.ROOT_FIELD_NAME, rootId);
+if(isVersion) sdoc.setField(CommonParams.VERSION_FIELD, version);
--- End diff --

needs a comment as said in JIRA why we add the version here


---

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[GitHub] lucene-solr pull request #455: SOLR-12638

2018-11-07 Thread dsmiley
Github user dsmiley commented on a diff in the pull request:

https://github.com/apache/lucene-solr/pull/455#discussion_r231566214
  
--- Diff: 
solr/core/src/java/org/apache/solr/update/processor/AtomicUpdateDocumentMerger.java
 ---
@@ -259,9 +264,8 @@ public boolean doInPlaceUpdateMerge(AddUpdateCommand 
cmd, Set updatedFie
 SolrInputDocument oldDocument = RealTimeGetComponent.getInputDocument
   (cmd.getReq().getCore(), idBytes,
null, // don't want the version to be returned
-   true, // avoid stored fields from index
updatedFields,
-   true); // resolve the full document
+   RealTimeGetComponent.Resolution.FULL_DOC); // get the partial 
document for the in-place update
--- End diff --

it is not a partial document; the original comment was correct.  Now with 
an enum, we don't even need a comment, as the enum has a self-explanatory name.


---

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[GitHub] lucene-solr pull request #455: SOLR-12638

2018-11-07 Thread dsmiley
Github user dsmiley commented on a diff in the pull request:

https://github.com/apache/lucene-solr/pull/455#discussion_r231567341
  
--- Diff: 
solr/core/src/java/org/apache/solr/handler/component/RealTimeGetComponent.java 
---
@@ -1164,6 +1225,31 @@ public void processGetUpdates(ResponseBuilder rb) 
throws IOException
 return new ArrayList<>(versionsToRet);
   }
 
+  /**
+   *  
+   *Lookup strategy for {@link #getInputDocument(SolrCore, BytesRef, 
AtomicLong, Set, Resolution)}.
+   *  
+   *  
+   *{@link #FULL_DOC}
+   *{@link #DOC_CHILDREN}
+   *{@link #FULL_HIERARCHY}
+   *  
+   */
+  public static enum Resolution {
--- End diff --

Suggested new names:  "DOC", "DOC_WITH_CHILDREN", "ROOT_WITH_CHILDREN".
Resolution is now only used where there is no "in place" so no need to 
mention partial documents.
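
i.e. something like (javadoc omitted for brevity):

    public enum Resolution {
      DOC,
      DOC_WITH_CHILDREN,
      ROOT_WITH_CHILDREN
    }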


---

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[GitHub] lucene-solr pull request #455: SOLR-12638

2018-11-06 Thread dsmiley
Github user dsmiley commented on a diff in the pull request:

https://github.com/apache/lucene-solr/pull/455#discussion_r231127727
  
--- Diff: 
solr/core/src/java/org/apache/solr/handler/component/RealTimeGetComponent.java 
---
@@ -639,12 +651,32 @@ public static SolrInputDocument 
getInputDocument(SolrCore core, BytesRef idBytes
   sid = new SolrInputDocument();
 } else {
   Document luceneDocument = docFetcher.doc(docid);
-  sid = toSolrInputDocument(luceneDocument, 
core.getLatestSchema());
+  sid = toSolrInputDocument(luceneDocument, schema);
 }
-if (onlyTheseNonStoredDVs != null) {
-  docFetcher.decorateDocValueFields(sid, docid, 
onlyTheseNonStoredDVs);
-} else {
-  docFetcher.decorateDocValueFields(sid, docid, 
docFetcher.getNonStoredDVsWithoutCopyTargets());
+ensureDocFieldsDecorated(onlyTheseNonStoredDVs, sid, docid, 
docFetcher, resolveRootDoc ||
+resolveChildren || 
schema.hasExplicitField(IndexSchema.NEST_PATH_FIELD_NAME));
+SolrInputField rootField = 
sid.getField(IndexSchema.ROOT_FIELD_NAME);
+if((resolveChildren || resolveRootDoc) && 
schema.isUsableForChildDocs() && rootField!=null) {
+  // doc is part of a nested structure
+  String id = resolveRootDoc? (String) rootField.getFirstValue(): 
(String) sid.getField(idField.getName()).getFirstValue();
+  ModifiableSolrParams params = new ModifiableSolrParams()
+  .set("fl", "*, _nest_path_, [child]")
+  .set("limit", "-1");
+  SolrQueryRequest nestedReq = new LocalSolrQueryRequest(core, 
params);
+  final BytesRef rootIdBytes = new BytesRef(id);
+  final int rootDocId = searcher.getFirstMatch(new 
Term(idField.getName(), rootIdBytes));
+  final DocTransformer childDocTransformer = 
TransformerFactory.defaultFactories.get("child").create("child", params, 
nestedReq);
--- End diff --

no, use `core.getTransformerFactory("child")`
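
i.e. (untested sketch):

    final DocTransformer childDocTransformer =
        core.getTransformerFactory("child").create("child", params, nestedReq);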


---

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[GitHub] lucene-solr pull request #:

2018-10-31 Thread dsmiley
Github user dsmiley commented on the pull request:


https://github.com/apache/lucene-solr/commit/43d7f5d104802bc3281d5995c6c2d71bebf1f369#commitcomment-31121980
  
I understand it's a design decision.  I'm not certain about the correctness 
of the logic; I've been leaving that to you here; DUP is a mess to untangle.  I 
think your method name is still inappropriate -- for example "if 
(sdoc.hasChildDocuments()) return true", so even if it's not an atomic update, we 
can still return true.  So instead let's name the method based on why the caller 
cares -- it wants to know whether to get the version or not.  Put it on DUP.  The 
method is not clean enough or general enough to go on the Cmd.
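
Something along these lines on DUP, roughly (the name here is just a
placeholder to show the intent):

    // placeholder name -- the point is it states why the caller cares
    private boolean shouldFetchVersionFromIndex(SolrInputDocument sdoc) {
      if (sdoc.hasChildDocuments()) return true;
      // ... plus the remaining conditions currently in isNestedAtomicUpdate ...
      return false;
    }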


---

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[GitHub] lucene-solr pull request #:

2018-10-29 Thread dsmiley
Github user dsmiley commented on the pull request:


https://github.com/apache/lucene-solr/commit/43d7f5d104802bc3281d5995c6c2d71bebf1f369#commitcomment-31096103
  
isNestedAtomicUpdate seems to return true even if it isn't an atomic 
update.  So its name isn't right, or the method is doing too much at once (i.e. 
should be split up?).  I'm not sure why it's needed -- I mean I see where 
you're calling it but I don't follow the rationale for its existence.  'course 
the overall code in this class is hideously complicated even before we came 
along, so that doesn't help ;-)


---

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[GitHub] lucene-solr pull request #455: SOLR-12638

2018-10-25 Thread dsmiley
Github user dsmiley commented on a diff in the pull request:

https://github.com/apache/lucene-solr/pull/455#discussion_r228197896
  
--- Diff: 
solr/core/src/java/org/apache/solr/handler/component/RealTimeGetComponent.java 
---
@@ -609,9 +618,10 @@ public static SolrInputDocument 
getInputDocument(SolrCore core, BytesRef idBytes
* @param resolveFullDocument In case the document is fetched from the 
tlog, it could only be a partial document if the last update
*  was an in-place update. In that case, should this 
partial document be resolved to a full document (by following
*  back prevPointer/prevVersion)?
+   * @param resolveBlock Check whether the document is part of a block. If 
so, return the whole block.
*/
   public static SolrInputDocument getInputDocument(SolrCore core, BytesRef 
idBytes, AtomicLong versionReturned, boolean avoidRetrievingStoredFields,
-  Set onlyTheseNonStoredDVs, boolean resolveFullDocument) 
throws IOException {
+  Set onlyTheseNonStoredDVs, boolean resolveFullDocument, 
boolean resolveBlock) throws IOException {
 SolrInputDocument sid = null;
--- End diff --

@moshebla I'm concerned about this method having so many parameters... it's 
a code smell.
The resolveBlock & resolveChildren booleans... I'm not sure how they differ.  
Isn't resolveChildren enough?  And if resolveChildren is true, then isn't 
resolveFullDocument effectively true as well?  (If so, the constraint could be 
documented and enforced with an assertion.)

versionReturned is a dubious parameter that I'm skeptical needs to be here.  I'm 
aware you didn't add it, but the number of parameters is getting troublingly 
long with your changes; it's good to review what's needed.
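
For the constraint, even just an assertion would do (assuming resolveChildren
is meant to imply resolveFullDocument):

    assert !resolveChildren || resolveFullDocument
        : "resolveChildren=true only makes sense when resolveFullDocument=true";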


---

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[GitHub] lucene-solr pull request #455: SOLR-12638

2018-10-25 Thread dsmiley
Github user dsmiley commented on a diff in the pull request:

https://github.com/apache/lucene-solr/pull/455#discussion_r228174440
  
--- Diff: solr/core/src/java/org/apache/solr/update/AddUpdateCommand.java 
---
@@ -262,6 +263,11 @@ private void flattenAnonymous(List 
unwrappedDocs, SolrInputDo
 flattenAnonymous(unwrappedDocs, currentDoc, false);
   }
 
+  public String getRouteFieldVal() {
--- End diff --

What code duplication?

I think we need to standardize/harmonize route a bit.  Notice UpdateCommand 
has setRoute & getRoute.  Let's initialize route in the constructor here by the 
presence of \_route\_ in the params.  Then let's not look for _route_ in params 
later since we know we can get it here.  Then I think 
`org.apache.solr.update.processor.DistributedUpdateProcessor#setupRequest(java.lang.String,
 org.apache.solr.common.SolrInputDocument)` can be removed and instead insist 
everyone call the overloaded version that takes a route, and each caller looks 
up the route from the command.  It's not clear to me if "null" actually should 
be passed to route in any circumstance.
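
Sketch of the constructor-initialization idea (assuming the usual \_route\_
param constant; untested):

    public AddUpdateCommand(SolrQueryRequest req) {
      super(req);
      // initialize route from the request params once, up front
      setRoute(req.getParams().get(ShardParams._ROUTE_));
    }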


---

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[GitHub] lucene-solr pull request #455: SOLR-12638

2018-10-24 Thread dsmiley
Github user dsmiley commented on a diff in the pull request:

https://github.com/apache/lucene-solr/pull/455#discussion_r228026284
  
--- Diff: solr/core/src/java/org/apache/solr/update/AddUpdateCommand.java 
---
@@ -262,6 +263,11 @@ private void flattenAnonymous(List 
unwrappedDocs, SolrInputDo
 flattenAnonymous(unwrappedDocs, currentDoc, false);
   }
 
+  public String getRouteFieldVal() {
--- End diff --

I'm skeptical of this method.  Its name seems innocent enough looking at 
the code here.  But then also consider that some collections have a "router.field", 
and this method is named in such a way that one would think it returns 
that field's value... yet it does not.  Some callers put this into a variable 
named "id" or similar.  Given that, I propose you remove it but incorporate the 
logic into getHashableId, which seems the proper place for it.  It _is_ the 
hashable ID... the hashable ID of a nested doc is its root.

But... I do also wonder if we need this at all.  Somewhere Solr already has 
code that looks at \_route\_ and acts on that if present.  Perhaps the code 
path for an atomic update isn't doing this yet but should do it?  Then we 
wouldn't need this change to AddUpdateCommand.
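
Inside getHashableId that could look roughly like this (sketch only; falls
through to the existing uniqueKey-based logic):

    String root = (String) solrDoc.getFieldValue(IndexSchema.ROOT_FIELD_NAME);
    if (root != null) {
      return root;  // the hashable id of a nested doc is its root
    }
    // ... existing uniqueKey-based resolution follows ...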


---

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[GitHub] lucene-solr issue #455: SOLR-12638

2018-10-19 Thread dsmiley
Github user dsmiley commented on the issue:

https://github.com/apache/lucene-solr/pull/455
  
As a general point, I feel we should prefer "nested" terminology instead of 
"block".  If we were purely working within Lucene then I think "block" might be 
okay but at the Solr layer people see this stuff as "nested" docs.


---

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[GitHub] lucene-solr pull request #455: SOLR-12638

2018-10-19 Thread dsmiley
Github user dsmiley commented on a diff in the pull request:

https://github.com/apache/lucene-solr/pull/455#discussion_r226709714
  
--- Diff: 
solr/core/src/test/org/apache/solr/update/processor/AtomicUpdateBlockTest.java 
---
@@ -59,39 +52,147 @@ public void before() {
 
   @Test
   public void testMergeChildDoc() throws Exception {
-SolrInputDocument doc = new SolrInputDocument();
-doc.setField("id", "1");
-doc.setField("cat_ss", new String[]{"aaa", "ccc"});
-doc.setField("child", Collections.singletonList(sdoc("id", "2", 
"cat_ss", "child")));
+SolrInputDocument newChildDoc = sdoc("id", "3", "cat_ss", "child");
+SolrInputDocument addedDoc = sdoc("id", "1",
+"cat_ss", Collections.singletonMap("add", "bbb"),
+"child", Collections.singletonMap("add", sdocs(newChildDoc)));
+
+SolrInputDocument dummyBlock = sdoc("id", "1",
+"cat_ss", new ArrayList<>(Arrays.asList("aaa", "ccc")),
+"_root_", "1", "child", new ArrayList<>(sdocs(addedDoc)));
+dummyBlock.removeField(VERSION);
+
+SolrInputDocument preMergeDoc = new SolrInputDocument(dummyBlock);
+AtomicUpdateDocumentMerger docMerger = new 
AtomicUpdateDocumentMerger(req());
+docMerger.merge(addedDoc, dummyBlock);
+assertEquals("merged document should have the same id", 
preMergeDoc.getFieldValue("id"), dummyBlock.getFieldValue("id"));
+assertDocContainsSubset(preMergeDoc, dummyBlock);
+assertDocContainsSubset(addedDoc, dummyBlock);
+assertDocContainsSubset(newChildDoc, (SolrInputDocument) ((List) 
dummyBlock.getFieldValues("child")).get(1));
+assertEquals(dummyBlock.getFieldValue("id"), 
dummyBlock.getFieldValue("id"));
+  }
+
+  @Test
+  public void testBlockAtomicQuantities() throws Exception {
+SolrInputDocument doc = sdoc("id", "1", "string_s", "root");
 addDoc(adoc(doc), "nested-rtg");
 
-BytesRef rootDocId = new BytesRef("1");
-SolrCore core = h.getCore();
-SolrInputDocument block = RealTimeGetComponent.getInputDocument(core, 
rootDocId, true);
-// assert block doc has child docs
-assertTrue(block.containsKey("child"));
+List docs = IntStream.range(10, 20).mapToObj(x -> 
sdoc("id", String.valueOf(x), "string_s", 
"child")).collect(Collectors.toList());
+doc = sdoc("id", "1", "children", Collections.singletonMap("add", 
docs));
+addAndGetVersion(doc, params("update.chain", "nested-rtg", "wt", 
"json"));
 
-assertJQ(req("q","id:1")
-,"/response/numFound==0"
+assertU(commit());
+
+assertJQ(req("q", "_root_:1"),
+"/response/numFound==11");
+
+assertJQ(req("q", "string_s:child", "fl", "*", "rows", "100"),
+"/response/numFound==10");
+
+// ensure updates work when block has more than 10 children
+for(int i = 10; i < 20; ++i) {
+  System.out.println("indexing " + i);
+  docs = IntStream.range(i * 10, (i * 10) + 5).mapToObj(x -> 
sdoc("id", String.valueOf(x), "string_s", 
"grandChild")).collect(Collectors.toList());
+  doc = sdoc("id", String.valueOf(i), "grandChildren", 
Collections.singletonMap("add", docs));
+  addAndGetVersion(doc, params("update.chain", "nested-rtg", "wt", 
"json"));
+  assertU(commit());
+}
+
+assertJQ(req("q", "id:114", "fl", "*", "rows", "100"),
--- End diff --

Why set the "fl" or "rows" in these queries?  Your assertion only checks 
numFound and not the content of those that were found.


---

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[GitHub] lucene-solr pull request #455: SOLR-12638

2018-10-19 Thread dsmiley
Github user dsmiley commented on a diff in the pull request:

https://github.com/apache/lucene-solr/pull/455#discussion_r226709264
  
--- Diff: 
solr/core/src/test/org/apache/solr/update/processor/AtomicUpdateBlockTest.java 
---
@@ -59,39 +52,147 @@ public void before() {
 
   @Test
   public void testMergeChildDoc() throws Exception {
-SolrInputDocument doc = new SolrInputDocument();
-doc.setField("id", "1");
-doc.setField("cat_ss", new String[]{"aaa", "ccc"});
-doc.setField("child", Collections.singletonList(sdoc("id", "2", 
"cat_ss", "child")));
+SolrInputDocument newChildDoc = sdoc("id", "3", "cat_ss", "child");
+SolrInputDocument addedDoc = sdoc("id", "1",
+"cat_ss", Collections.singletonMap("add", "bbb"),
+"child", Collections.singletonMap("add", sdocs(newChildDoc)));
+
+SolrInputDocument dummyBlock = sdoc("id", "1",
+"cat_ss", new ArrayList<>(Arrays.asList("aaa", "ccc")),
+"_root_", "1", "child", new ArrayList<>(sdocs(addedDoc)));
+dummyBlock.removeField(VERSION);
+
+SolrInputDocument preMergeDoc = new SolrInputDocument(dummyBlock);
+AtomicUpdateDocumentMerger docMerger = new 
AtomicUpdateDocumentMerger(req());
+docMerger.merge(addedDoc, dummyBlock);
+assertEquals("merged document should have the same id", 
preMergeDoc.getFieldValue("id"), dummyBlock.getFieldValue("id"));
+assertDocContainsSubset(preMergeDoc, dummyBlock);
+assertDocContainsSubset(addedDoc, dummyBlock);
+assertDocContainsSubset(newChildDoc, (SolrInputDocument) ((List) 
dummyBlock.getFieldValues("child")).get(1));
+assertEquals(dummyBlock.getFieldValue("id"), 
dummyBlock.getFieldValue("id"));
+  }
+
+  @Test
+  public void testBlockAtomicQuantities() throws Exception {
+SolrInputDocument doc = sdoc("id", "1", "string_s", "root");
 addDoc(adoc(doc), "nested-rtg");
 
-BytesRef rootDocId = new BytesRef("1");
-SolrCore core = h.getCore();
-SolrInputDocument block = RealTimeGetComponent.getInputDocument(core, 
rootDocId, true);
-// assert block doc has child docs
-assertTrue(block.containsKey("child"));
+List docs = IntStream.range(10, 20).mapToObj(x -> 
sdoc("id", String.valueOf(x), "string_s", 
"child")).collect(Collectors.toList());
+doc = sdoc("id", "1", "children", Collections.singletonMap("add", 
docs));
+addAndGetVersion(doc, params("update.chain", "nested-rtg", "wt", 
"json"));
 
-assertJQ(req("q","id:1")
-,"/response/numFound==0"
+assertU(commit());
+
+assertJQ(req("q", "_root_:1"),
+"/response/numFound==11");
+
+assertJQ(req("q", "string_s:child", "fl", "*", "rows", "100"),
+"/response/numFound==10");
+
+// ensure updates work when block has more than 10 children
--- End diff --

Why is 10 special?


---

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[GitHub] lucene-solr pull request #455: SOLR-12638

2018-10-10 Thread dsmiley
Github user dsmiley commented on a diff in the pull request:

https://github.com/apache/lucene-solr/pull/455#discussion_r224133917
  
--- Diff: 
solr/core/src/test/org/apache/solr/update/processor/AtomicUpdateBlockTest.java 
---
@@ -0,0 +1,370 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.solr.update.processor;
+
+import java.util.Arrays;
+import java.util.Collection;
+import java.util.Collections;
+import java.util.List;
+
+import org.apache.lucene.util.BytesRef;
+import org.apache.solr.SolrTestCaseJ4;
+import org.apache.solr.common.SolrInputDocument;
+import org.apache.solr.common.SolrInputField;
+import org.apache.solr.core.SolrCore;
+import org.apache.solr.handler.component.RealTimeGetComponent;
+import org.junit.AfterClass;
+import org.junit.Before;
+import org.junit.BeforeClass;
+import org.junit.Test;
+
+public class AtomicUpdateBlockTest extends SolrTestCaseJ4 {
+
+  private final static String VERSION = "_version_";
+  private static String PREVIOUS_ENABLE_UPDATE_LOG_VALUE;
+
+  @BeforeClass
+  public static void beforeTests() throws Exception {
+PREVIOUS_ENABLE_UPDATE_LOG_VALUE = 
System.getProperty("enable.update.log");
+System.setProperty("enable.update.log", "true");
+initCore("solrconfig-block-atomic-update.xml", "schema-nest.xml"); // 
use "nest" schema
+  }
+
+  @AfterClass
+  public static void afterTests() throws Exception {
+// restore enable.update.log
+System.setProperty("enable.update.log", 
PREVIOUS_ENABLE_UPDATE_LOG_VALUE);
+  }
+
+  @Before
+  public void before() {
+clearIndex();
+assertU(commit());
+  }
+
+  @Test
+  public void testMergeChildDoc() throws Exception {
+SolrInputDocument doc = new SolrInputDocument();
+doc.setField("id", "1");
+doc.setField("cat_ss", new String[]{"aaa", "ccc"});
+doc.setField("child", Collections.singletonList(sdoc("id", "2", 
"cat_ss", "child")));
+addDoc(adoc(doc), "nested-rtg");
+
+BytesRef rootDocId = new BytesRef("1");
+SolrCore core = h.getCore();
+SolrInputDocument block = RealTimeGetComponent.getInputDocument(core, 
rootDocId, true);
+// assert block doc has child docs
+assertTrue(block.containsKey("child"));
+
+assertJQ(req("q","id:1")
+,"/response/numFound==0"
+);
+
+// commit the changes
+assertU(commit());
+
+SolrInputDocument newChildDoc = sdoc("id", "3", "cat_ss", "child");
+SolrInputDocument addedDoc = sdoc("id", "1",
+"cat_ss", Collections.singletonMap("add", "bbb"),
+"child", Collections.singletonMap("add", sdocs(newChildDoc)));
+block = RealTimeGetComponent.getInputDocument(core, rootDocId, true);
+block.removeField(VERSION);
+SolrInputDocument preMergeDoc = new SolrInputDocument(block);
+AtomicUpdateDocumentMerger docMerger = new 
AtomicUpdateDocumentMerger(req());
+docMerger.merge(addedDoc, block);
+assertEquals("merged document should have the same id", 
preMergeDoc.getFieldValue("id"), block.getFieldValue("id"));
+assertDocContainsSubset(preMergeDoc, block);
+assertDocContainsSubset(addedDoc, block);
+assertDocContainsSubset(newChildDoc, (SolrInputDocument) ((List) 
block.getFieldValues("child")).get(1));
+assertEquals(doc.getFieldValue("id"), block.getFieldValue("id"));
+  }
+
+  @Test
+  public void testBlockAtom

[GitHub] lucene-solr pull request #455: SOLR-12638

2018-10-10 Thread dsmiley
Github user dsmiley commented on a diff in the pull request:

https://github.com/apache/lucene-solr/pull/455#discussion_r224127559
  
--- Diff: 
solr/core/src/test/org/apache/solr/update/processor/AtomicUpdateBlockTest.java 
---
@@ -0,0 +1,370 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.solr.update.processor;
+
+import java.util.Arrays;
+import java.util.Collection;
+import java.util.Collections;
+import java.util.List;
+
+import org.apache.lucene.util.BytesRef;
+import org.apache.solr.SolrTestCaseJ4;
+import org.apache.solr.common.SolrInputDocument;
+import org.apache.solr.common.SolrInputField;
+import org.apache.solr.core.SolrCore;
+import org.apache.solr.handler.component.RealTimeGetComponent;
+import org.junit.AfterClass;
+import org.junit.Before;
+import org.junit.BeforeClass;
+import org.junit.Test;
+
+public class AtomicUpdateBlockTest extends SolrTestCaseJ4 {
+
+  private final static String VERSION = "_version_";
+  private static String PREVIOUS_ENABLE_UPDATE_LOG_VALUE;
+
+  @BeforeClass
+  public static void beforeTests() throws Exception {
+PREVIOUS_ENABLE_UPDATE_LOG_VALUE = 
System.getProperty("enable.update.log");
+System.setProperty("enable.update.log", "true");
--- End diff --

The `solrconfig-block-atomic-update.xml` file is not toggled by this system 
property (perhaps others are).  Why set this system property?


---

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[GitHub] lucene-solr pull request #455: SOLR-12638

2018-10-10 Thread dsmiley
Github user dsmiley commented on a diff in the pull request:

https://github.com/apache/lucene-solr/pull/455#discussion_r223747918
  
--- Diff: 
solr/core/src/java/org/apache/solr/update/processor/AtomicUpdateDocumentMerger.java
 ---
@@ -442,5 +442,58 @@ protected void doRemoveRegex(SolrInputDocument toDoc, 
SolrInputField sif, Object
 }
 return patterns;
   }
+
+  private Object getNativeFieldValue(String fieldName, Object val) {
+if(isChildDoc(val)) {
+  return val;
+}
+SchemaField sf = schema.getField(fieldName);
+return sf.getType().toNativeType(val);
+  }
+
+  private static boolean isChildDoc(Object obj) {
+if(!(obj instanceof Collection)) {
+  return obj instanceof SolrDocumentBase;
+}
+Collection objValues = (Collection) obj;
+if(objValues.size() == 0) {
+  return false;
+}
+return objValues.iterator().next() instanceof SolrDocumentBase;
+  }
+
+  private void removeObj(Collection original, Object toRemove, String 
fieldName) {
+if(isChildDoc(toRemove)) {
+  removeChildDoc(original, (SolrInputDocument) toRemove);
+} else {
+  original.remove(getNativeFieldValue(fieldName, toRemove));
+}
+  }
+
+  private static void removeChildDoc(Collection original, 
SolrInputDocument docToRemove) {
+for(SolrInputDocument doc: (Collection) original) {
+  if(isDerivedFromDoc(doc, docToRemove)) {
+original.remove(doc);
+return;
+  }
+}
+  }
+
+  /**
+   *
+   * @param fullDoc the document to be tested
+   * @param subDoc the sub document that should be a subset of fullDoc
+   * @return whether subDoc is a subset of fullDoc
+   */
+  private static boolean isDerivedFromDoc(SolrInputDocument fullDoc, 
SolrInputDocument subDoc) {
+for(SolrInputField subSif: subDoc) {
+  String fieldName = subSif.getName();
+  if(!fullDoc.containsKey(fieldName)) return false;
--- End diff --

This results in a double lookup of the values, together with the next line.  Remove 
this line and, after the next one, simply do a null-check.
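
i.e. (sketch):

    Collection<Object> fieldValues = fullDoc.getFieldValues(fieldName);
    if (fieldValues == null) return false;  // field absent; subDoc can't be a subset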


---

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[GitHub] lucene-solr pull request #455: SOLR-12638

2018-10-10 Thread dsmiley
Github user dsmiley commented on a diff in the pull request:

https://github.com/apache/lucene-solr/pull/455#discussion_r224306158
  
--- Diff: 
solr/core/src/java/org/apache/solr/handler/component/RealTimeGetComponent.java 
---
@@ -639,12 +650,30 @@ public static SolrInputDocument 
getInputDocument(SolrCore core, BytesRef idBytes
   sid = new SolrInputDocument();
 } else {
   Document luceneDocument = docFetcher.doc(docid);
-  sid = toSolrInputDocument(luceneDocument, 
core.getLatestSchema());
+  sid = toSolrInputDocument(luceneDocument, schema);
 }
-if (onlyTheseNonStoredDVs != null) {
-  docFetcher.decorateDocValueFields(sid, docid, 
onlyTheseNonStoredDVs);
-} else {
-  docFetcher.decorateDocValueFields(sid, docid, 
docFetcher.getNonStoredDVsWithoutCopyTargets());
+ensureDocDecorated(onlyTheseNonStoredDVs, sid, docid, docFetcher, 
resolveBlock || schema.hasExplicitField(IndexSchema.NEST_PATH_FIELD_NAME));
+SolrInputField rootField;
+if(resolveBlock && schema.isUsableForChildDocs() && (rootField = 
sid.getField(IndexSchema.ROOT_FIELD_NAME))!=null) {
+  // doc is part of a nested structure
+  ModifiableSolrParams params = new ModifiableSolrParams()
+  .set("q", 
core.getLatestSchema().getUniqueKeyField().getName()+ ":" 
+rootField.getFirstValue())
--- End diff --

It seems the LocalSolrQueryRequest here is a dummy needed to satisfy some 
of the methods below.  This threw me; there should be comments and/or choice of 
var names (e.g. dummyReq) to reflect this.  "q" isn't needed; just the "fl".  
It seems we don't even need the "fl" here since that can be supplied as the 
first parameter to SolrReturnFields, which seems better if it works.
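
i.e. something like this (untested; "dummyReq" is whatever local request you
build just below, and the "fl" param on it becomes unnecessary):

    ReturnFields returnFields = new SolrReturnFields("*, _nest_path_, [child]", dummyReq);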


---

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[GitHub] lucene-solr pull request #455: SOLR-12638

2018-10-10 Thread dsmiley
Github user dsmiley commented on a diff in the pull request:

https://github.com/apache/lucene-solr/pull/455#discussion_r224306892
  
--- Diff: 
solr/core/src/java/org/apache/solr/update/processor/DistributedUpdateProcessor.java
 ---
@@ -1360,24 +1385,47 @@ boolean getUpdatedDocument(AddUpdateCommand cmd, 
long versionOnUpdate) throws IO
 }
 
 // full (non-inplace) atomic update
+final boolean isNestedSchema = req.getSchema().isUsableForChildDocs();
 SolrInputDocument sdoc = cmd.getSolrInputDocument();
 BytesRef id = cmd.getIndexedId();
-SolrInputDocument oldDoc = 
RealTimeGetComponent.getInputDocument(cmd.getReq().getCore(), id);
+SolrInputDocument blockDoc = 
RealTimeGetComponent.getInputDocument(cmd.getReq().getCore(), id, null,
+false, null, true, true);
 
-if (oldDoc == null) {
-  // create a new doc by default if an old one wasn't found
-  if (versionOnUpdate <= 0) {
-oldDoc = new SolrInputDocument();
-  } else {
+if (blockDoc == null) {
+  if (versionOnUpdate > 0) {
 // could just let the optimistic locking throw the error
 throw new SolrException(ErrorCode.CONFLICT, "Document not found 
for update.  id=" + cmd.getPrintableId());
   }
 } else {
-  oldDoc.remove(CommonParams.VERSION_FIELD);
+  blockDoc.remove(CommonParams.VERSION_FIELD);
 }
 
 
-cmd.solrDoc = docMerger.merge(sdoc, oldDoc);
+SolrInputDocument mergedDoc;
+if(idField == null || blockDoc == null) {
+  // create a new doc by default if an old one wasn't found
+  mergedDoc = docMerger.merge(sdoc, new SolrInputDocument());
+} else {
+  if(isNestedSchema && 
req.getSchema().hasExplicitField(IndexSchema.NEST_PATH_FIELD_NAME) &&
+  blockDoc.containsKey(IndexSchema.ROOT_FIELD_NAME) && 
!id.utf8ToString().equals(blockDoc.getFieldValue(IndexSchema.ROOT_FIELD_NAME))) 
{
--- End diff --

I don't think we can assume id.utf8ToString() is correct.  I think we have 
to consult the corresponding FieldType to get the "external value".  Also, cast 
blockDoc.getFieldValue as a String to make it clear we expected it to be one.
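
Roughly (sketch; goes through the uniqueKey FieldType to get the external form
rather than relying on utf8ToString() directly):

    SchemaField uniqueKey = req.getSchema().getUniqueKeyField();
    String externalId = uniqueKey.getType().indexedToReadable(id.utf8ToString());
    String rootVal = (String) blockDoc.getFieldValue(IndexSchema.ROOT_FIELD_NAME);  // cast documents the expectation
    boolean isChildDoc = !externalId.equals(rootVal);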


---

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[GitHub] lucene-solr pull request #455: SOLR-12638

2018-10-10 Thread dsmiley
Github user dsmiley commented on a diff in the pull request:

https://github.com/apache/lucene-solr/pull/455#discussion_r223747455
  
--- Diff: 
solr/core/src/java/org/apache/solr/update/processor/AtomicUpdateDocumentMerger.java
 ---
@@ -442,5 +442,58 @@ protected void doRemoveRegex(SolrInputDocument toDoc, 
SolrInputField sif, Object
 }
 return patterns;
   }
+
+  private Object getNativeFieldValue(String fieldName, Object val) {
+if(isChildDoc(val)) {
+  return val;
+}
+SchemaField sf = schema.getField(fieldName);
+return sf.getType().toNativeType(val);
+  }
+
+  private static boolean isChildDoc(Object obj) {
+if(!(obj instanceof Collection)) {
+  return obj instanceof SolrDocumentBase;
+}
+Collection objValues = (Collection) obj;
+if(objValues.size() == 0) {
+  return false;
+}
+return objValues.iterator().next() instanceof SolrDocumentBase;
+  }
+
+  private void removeObj(Collection original, Object toRemove, String 
fieldName) {
+if(isChildDoc(toRemove)) {
+  removeChildDoc(original, (SolrInputDocument) toRemove);
+} else {
+  original.remove(getNativeFieldValue(fieldName, toRemove));
+}
+  }
+
+  private static void removeChildDoc(Collection original, 
SolrInputDocument docToRemove) {
+for(SolrInputDocument doc: (Collection) original) {
+  if(isDerivedFromDoc(doc, docToRemove)) {
+original.remove(doc);
+return;
+  }
+}
+  }
+
+  /**
+   *
+   * @param fullDoc the document to be tested
+   * @param subDoc the sub document that should be a subset of fullDoc
+   * @return whether subDoc is a subset of fullDoc
+   */
+  private static boolean isDerivedFromDoc(SolrInputDocument fullDoc, 
SolrInputDocument subDoc) {
+for(SolrInputField subSif: subDoc) {
+  String fieldName = subSif.getName();
+  if(!fullDoc.containsKey(fieldName)) return false;
+  Collection fieldValues = fullDoc.getFieldValues(fieldName);
+  if(fieldValues.size() < subSif.getValueCount()) return false;
+  
if(!fullDoc.getFieldValues(fieldName).containsAll(subSif.getValues())) return 
false;
--- End diff --

`fullDoc.getFieldValues(fieldName)` on this line can be replaced by 
`fieldValues` since we already have the values.  And the size check on the previous 
line is unnecessary, since the containsAll check on this line would fail anyway in 
that case.
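
So the loop body could shrink to (sketch):

    Collection<Object> fieldValues = fullDoc.getFieldValues(fieldName);
    if (fieldValues == null) return false;
    if (!fieldValues.containsAll(subSif.getValues())) return false;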


---

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[GitHub] lucene-solr pull request #455: SOLR-12638

2018-10-10 Thread dsmiley
Github user dsmiley commented on a diff in the pull request:

https://github.com/apache/lucene-solr/pull/455#discussion_r224307768
  
--- Diff: 
solr/core/src/java/org/apache/solr/update/processor/DistributedUpdateProcessor.java
 ---
@@ -1360,24 +1385,47 @@ boolean getUpdatedDocument(AddUpdateCommand cmd, 
long versionOnUpdate) throws IO
 }
 
 // full (non-inplace) atomic update
+final boolean isNestedSchema = req.getSchema().isUsableForChildDocs();
 SolrInputDocument sdoc = cmd.getSolrInputDocument();
 BytesRef id = cmd.getIndexedId();
-SolrInputDocument oldDoc = 
RealTimeGetComponent.getInputDocument(cmd.getReq().getCore(), id);
+SolrInputDocument blockDoc = 
RealTimeGetComponent.getInputDocument(cmd.getReq().getCore(), id, null,
+false, null, true, true);
 
-if (oldDoc == null) {
-  // create a new doc by default if an old one wasn't found
-  if (versionOnUpdate <= 0) {
-oldDoc = new SolrInputDocument();
-  } else {
+if (blockDoc == null) {
+  if (versionOnUpdate > 0) {
 // could just let the optimistic locking throw the error
 throw new SolrException(ErrorCode.CONFLICT, "Document not found 
for update.  id=" + cmd.getPrintableId());
   }
 } else {
-  oldDoc.remove(CommonParams.VERSION_FIELD);
+  blockDoc.remove(CommonParams.VERSION_FIELD);
 }
 
 
-cmd.solrDoc = docMerger.merge(sdoc, oldDoc);
+SolrInputDocument mergedDoc;
+if(idField == null || blockDoc == null) {
+  // create a new doc by default if an old one wasn't found
+  mergedDoc = docMerger.merge(sdoc, new SolrInputDocument());
+} else {
+  if(isNestedSchema && 
req.getSchema().hasExplicitField(IndexSchema.NEST_PATH_FIELD_NAME) &&
+  blockDoc.containsKey(IndexSchema.ROOT_FIELD_NAME) && 
!id.utf8ToString().equals(blockDoc.getFieldValue(IndexSchema.ROOT_FIELD_NAME))) 
{
+SolrInputDocument oldDoc = 
RealTimeGetComponent.getInputDocument(cmd.getReq().getCore(), id, null,
+false, null, true, false);
+mergedDoc = docMerger.merge(sdoc, oldDoc);
+String docPath = (String) 
mergedDoc.getFieldValue(IndexSchema.NEST_PATH_FIELD_NAME);
+List docPaths = StrUtils.splitSmart(docPath, 
PATH_SEP_CHAR);
+SolrInputField replaceDoc = 
blockDoc.getField(docPaths.remove(0).replaceAll(PATH_SEP_CHAR + "|" + 
NUM_SEP_CHAR, ""));
--- End diff --

The logic here (not just this line) is non-obvious to me.  There are no 
comments.  Please add comments and maybe refactor out a method.  The use of 
replaceAll with a regexp is suspicious to me.  None of the tests you added 
triggered a breakpoint within the docPaths loop below.  It needs more testing, 
or there may be an error.


---

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[GitHub] lucene-solr pull request #455: SOLR-12638

2018-10-10 Thread dsmiley
Github user dsmiley commented on a diff in the pull request:

https://github.com/apache/lucene-solr/pull/455#discussion_r224306195
  
--- Diff: 
solr/core/src/java/org/apache/solr/handler/component/RealTimeGetComponent.java 
---
@@ -661,6 +690,21 @@ public static SolrInputDocument 
getInputDocument(SolrCore core, BytesRef idBytes
 return sid;
   }
 
+  private static void ensureDocDecorated(Set 
onlyTheseNonStoredDVs, SolrDocumentBase doc, int docid, SolrDocumentFetcher 
docFetcher) throws IOException {
--- End diff --

I suggest renaming these methods `ensureDocDecorated` since it's what it 
calls.


---

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[GitHub] lucene-solr pull request #455: SOLR-12638

2018-10-10 Thread dsmiley
Github user dsmiley commented on a diff in the pull request:

https://github.com/apache/lucene-solr/pull/455#discussion_r224209409
  
--- Diff: 
solr/core/src/test/org/apache/solr/update/processor/AtomicUpdateBlockTest.java 
---
@@ -0,0 +1,370 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.solr.update.processor;
+
+import java.util.Arrays;
+import java.util.Collection;
+import java.util.Collections;
+import java.util.List;
+
+import org.apache.lucene.util.BytesRef;
+import org.apache.solr.SolrTestCaseJ4;
+import org.apache.solr.common.SolrInputDocument;
+import org.apache.solr.common.SolrInputField;
+import org.apache.solr.core.SolrCore;
+import org.apache.solr.handler.component.RealTimeGetComponent;
+import org.junit.AfterClass;
+import org.junit.Before;
+import org.junit.BeforeClass;
+import org.junit.Test;
+
+public class AtomicUpdateBlockTest extends SolrTestCaseJ4 {
+
+  private final static String VERSION = "_version_";
+  private static String PREVIOUS_ENABLE_UPDATE_LOG_VALUE;
+
+  @BeforeClass
+  public static void beforeTests() throws Exception {
+PREVIOUS_ENABLE_UPDATE_LOG_VALUE = 
System.getProperty("enable.update.log");
+System.setProperty("enable.update.log", "true");
+initCore("solrconfig-block-atomic-update.xml", "schema-nest.xml"); // 
use "nest" schema
+  }
+
+  @AfterClass
+  public static void afterTests() throws Exception {
+// restore enable.update.log
+System.setProperty("enable.update.log", 
PREVIOUS_ENABLE_UPDATE_LOG_VALUE);
+  }
+
+  @Before
+  public void before() {
+clearIndex();
+assertU(commit());
+  }
+
+  @Test
+  public void testMergeChildDoc() throws Exception {
+SolrInputDocument doc = new SolrInputDocument();
+doc.setField("id", "1");
+doc.setField("cat_ss", new String[]{"aaa", "ccc"});
+doc.setField("child", Collections.singletonList(sdoc("id", "2", 
"cat_ss", "child")));
+addDoc(adoc(doc), "nested-rtg");
+
+BytesRef rootDocId = new BytesRef("1");
+SolrCore core = h.getCore();
+SolrInputDocument block = RealTimeGetComponent.getInputDocument(core, 
rootDocId, true);
+// assert block doc has child docs
+assertTrue(block.containsKey("child"));
+
+assertJQ(req("q","id:1")
+,"/response/numFound==0"
+);
+
+// commit the changes
+assertU(commit());
+
+SolrInputDocument newChildDoc = sdoc("id", "3", "cat_ss", "child");
+SolrInputDocument addedDoc = sdoc("id", "1",
+"cat_ss", Collections.singletonMap("add", "bbb"),
+"child", Collections.singletonMap("add", sdocs(newChildDoc)));
+block = RealTimeGetComponent.getInputDocument(core, rootDocId, true);
+block.removeField(VERSION);
+SolrInputDocument preMergeDoc = new SolrInputDocument(block);
+AtomicUpdateDocumentMerger docMerger = new 
AtomicUpdateDocumentMerger(req());
+docMerger.merge(addedDoc, block);
+assertEquals("merged document should have the same id", 
preMergeDoc.getFieldValue("id"), block.getFieldValue("id"));
+assertDocContainsSubset(preMergeDoc, block);
+assertDocContainsSubset(addedDoc, block);
+assertDocContainsSubset(newChildDoc, (SolrInputDocument) ((List) 
block.getFieldValues("child")).get(1));
+assertEquals(doc.getFieldValue("id"), block.getFieldValue("id"));
+  }
+
+  @Test
+  public void testBlockAtomicAdd() throws Exception {
+
+SolrInputDocument

[GitHub] lucene-solr pull request #455: SOLR-12638

2018-10-10 Thread dsmiley
Github user dsmiley commented on a diff in the pull request:

https://github.com/apache/lucene-solr/pull/455#discussion_r224283996
  
--- Diff: 
solr/core/src/java/org/apache/solr/handler/component/RealTimeGetComponent.java 
---
@@ -609,9 +618,10 @@ public static SolrInputDocument 
getInputDocument(SolrCore core, BytesRef idBytes
* @param resolveFullDocument In case the document is fetched from the 
tlog, it could only be a partial document if the last update
*  was an in-place update. In that case, should this 
partial document be resolved to a full document (by following
*  back prevPointer/prevVersion)?
+   * @param resolveBlock Check whether the document is part of a block. If 
so, return the whole block.
*/
   public static SolrInputDocument getInputDocument(SolrCore core, BytesRef 
idBytes, AtomicLong versionReturned, boolean avoidRetrievingStoredFields,
-  Set onlyTheseNonStoredDVs, boolean resolveFullDocument) 
throws IOException {
+  Set onlyTheseNonStoredDVs, boolean resolveFullDocument, 
boolean resolveBlock) throws IOException {
 SolrInputDocument sid = null;
--- End diff --

It would be helpful to add a javadoc comment to say that if the id refers 
to a nested document (which isn't known up-front), then it'll never be found in 
the tlog (at least if you follow the rules of nested docs).  Also, perhaps the 
parameter "resolveBlock" should be "resolveToRootDocument" since I think the 
"root" terminology may be more widely used as it's even in the schema, whereas 
"block" is I think not so much.  If you disagree, a compromise may be to use 
both "root" and "Block" together -- "resolveRootBlock".


---

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[GitHub] lucene-solr pull request #455: SOLR-12638

2018-10-10 Thread dsmiley
Github user dsmiley commented on a diff in the pull request:

https://github.com/apache/lucene-solr/pull/455#discussion_r224300831
  
--- Diff: 
solr/core/src/java/org/apache/solr/handler/component/RealTimeGetComponent.java 
---
@@ -639,12 +650,30 @@ public static SolrInputDocument 
getInputDocument(SolrCore core, BytesRef idBytes
   sid = new SolrInputDocument();
 } else {
   Document luceneDocument = docFetcher.doc(docid);
-  sid = toSolrInputDocument(luceneDocument, 
core.getLatestSchema());
+  sid = toSolrInputDocument(luceneDocument, schema);
 }
-if (onlyTheseNonStoredDVs != null) {
-  docFetcher.decorateDocValueFields(sid, docid, 
onlyTheseNonStoredDVs);
-} else {
-  docFetcher.decorateDocValueFields(sid, docid, 
docFetcher.getNonStoredDVsWithoutCopyTargets());
+ensureDocDecorated(onlyTheseNonStoredDVs, sid, docid, docFetcher, 
resolveBlock || schema.hasExplicitField(IndexSchema.NEST_PATH_FIELD_NAME));
+SolrInputField rootField;
--- End diff --

no big deal to simply initialize rootField up front.  You are doing it as 
an expression with a side-effect below which is needlessly awkward.
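
i.e. simply (sketch):

    SolrInputField rootField = sid.getField(IndexSchema.ROOT_FIELD_NAME);
    if (resolveBlock && schema.isUsableForChildDocs() && rootField != null) {
      // ... nested handling ...
    }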


---

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[GitHub] lucene-solr pull request #455: SOLR-12638

2018-10-10 Thread dsmiley
Github user dsmiley commented on a diff in the pull request:

https://github.com/apache/lucene-solr/pull/455#discussion_r224130315
  
--- Diff: 
solr/core/src/test/org/apache/solr/update/processor/AtomicUpdateBlockTest.java 
---
@@ -0,0 +1,370 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.solr.update.processor;
+
+import java.util.Arrays;
+import java.util.Collection;
+import java.util.Collections;
+import java.util.List;
+
+import org.apache.lucene.util.BytesRef;
+import org.apache.solr.SolrTestCaseJ4;
+import org.apache.solr.common.SolrInputDocument;
+import org.apache.solr.common.SolrInputField;
+import org.apache.solr.core.SolrCore;
+import org.apache.solr.handler.component.RealTimeGetComponent;
+import org.junit.AfterClass;
+import org.junit.Before;
+import org.junit.BeforeClass;
+import org.junit.Test;
+
+public class AtomicUpdateBlockTest extends SolrTestCaseJ4 {
+
+  private final static String VERSION = "_version_";
+  private static String PREVIOUS_ENABLE_UPDATE_LOG_VALUE;
+
+  @BeforeClass
+  public static void beforeTests() throws Exception {
+PREVIOUS_ENABLE_UPDATE_LOG_VALUE = 
System.getProperty("enable.update.log");
+System.setProperty("enable.update.log", "true");
+initCore("solrconfig-block-atomic-update.xml", "schema-nest.xml"); // 
use "nest" schema
+  }
+
+  @AfterClass
+  public static void afterTests() throws Exception {
+// restore enable.update.log
+System.setProperty("enable.update.log", 
PREVIOUS_ENABLE_UPDATE_LOG_VALUE);
+  }
+
+  @Before
+  public void before() {
+clearIndex();
+assertU(commit());
+  }
+
+  @Test
+  public void testMergeChildDoc() throws Exception {
+SolrInputDocument doc = new SolrInputDocument();
+doc.setField("id", "1");
+doc.setField("cat_ss", new String[]{"aaa", "ccc"});
+doc.setField("child", Collections.singletonList(sdoc("id", "2", 
"cat_ss", "child")));
+addDoc(adoc(doc), "nested-rtg");
+
+BytesRef rootDocId = new BytesRef("1");
+SolrCore core = h.getCore();
+SolrInputDocument block = RealTimeGetComponent.getInputDocument(core, 
rootDocId, true);
+// assert block doc has child docs
+assertTrue(block.containsKey("child"));
+
+assertJQ(req("q","id:1")
+,"/response/numFound==0"
+);
+
+// commit the changes
+assertU(commit());
+
+SolrInputDocument newChildDoc = sdoc("id", "3", "cat_ss", "child");
+SolrInputDocument addedDoc = sdoc("id", "1",
+"cat_ss", Collections.singletonMap("add", "bbb"),
+"child", Collections.singletonMap("add", sdocs(newChildDoc)));
+block = RealTimeGetComponent.getInputDocument(core, rootDocId, true);
+block.removeField(VERSION);
+SolrInputDocument preMergeDoc = new SolrInputDocument(block);
+AtomicUpdateDocumentMerger docMerger = new 
AtomicUpdateDocumentMerger(req());
+docMerger.merge(addedDoc, block);
--- End diff --

It seems the point of this test is to test 
AtomicUpdateDocumentMerger.merge?  Then why even index anything at all (the 
first half of this test)?  Simply create the documents directly and call the 
merge method and test the result.
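
i.e. something closer to this (sketch; no indexing or commit needed):

    SolrInputDocument existing = sdoc("id", "1",
        "cat_ss", new String[]{"aaa", "ccc"},
        "child", Collections.singletonList(sdoc("id", "2", "cat_ss", "child")));
    SolrInputDocument atomicUpdate = sdoc("id", "1",
        "cat_ss", Collections.singletonMap("add", "bbb"),
        "child", Collections.singletonMap("add", sdocs(sdoc("id", "3", "cat_ss", "child"))));
    new AtomicUpdateDocumentMerger(req()).merge(atomicUpdate, existing);
    // ...then assert directly on 'existing'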


---

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[GitHub] lucene-solr pull request #455: SOLR-12638

2018-10-10 Thread dsmiley
Github user dsmiley commented on a diff in the pull request:

https://github.com/apache/lucene-solr/pull/455#discussion_r223749656
  
--- Diff: 
solr/core/src/java/org/apache/solr/update/processor/DistributedUpdateProcessor.java
 ---
@@ -111,6 +114,8 @@
   public static final String DISTRIB_FROM = "distrib.from";
   public static final String DISTRIB_INPLACE_PREVVERSION = 
"distrib.inplace.prevversion";
   private static final String TEST_DISTRIB_SKIP_SERVERS = 
"test.distrib.skip.servers";
+  private static final char PATH_SEP_CHAR = '/';
--- End diff --

Please don't create frivolous constants like this.


---

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[GitHub] lucene-solr pull request #455: SOLR-12638

2018-10-10 Thread dsmiley
Github user dsmiley commented on a diff in the pull request:

https://github.com/apache/lucene-solr/pull/455#discussion_r224131263
  
--- Diff: 
solr/core/src/test/org/apache/solr/update/processor/AtomicUpdateBlockTest.java 
---
@@ -0,0 +1,370 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.solr.update.processor;
+
+import java.util.Arrays;
+import java.util.Collection;
+import java.util.Collections;
+import java.util.List;
+
+import org.apache.lucene.util.BytesRef;
+import org.apache.solr.SolrTestCaseJ4;
+import org.apache.solr.common.SolrInputDocument;
+import org.apache.solr.common.SolrInputField;
+import org.apache.solr.core.SolrCore;
+import org.apache.solr.handler.component.RealTimeGetComponent;
+import org.junit.AfterClass;
+import org.junit.Before;
+import org.junit.BeforeClass;
+import org.junit.Test;
+
+public class AtomicUpdateBlockTest extends SolrTestCaseJ4 {
+
+  private final static String VERSION = "_version_";
+  private static String PREVIOUS_ENABLE_UPDATE_LOG_VALUE;
+
+  @BeforeClass
+  public static void beforeTests() throws Exception {
+PREVIOUS_ENABLE_UPDATE_LOG_VALUE = 
System.getProperty("enable.update.log");
+System.setProperty("enable.update.log", "true");
+initCore("solrconfig-block-atomic-update.xml", "schema-nest.xml"); // 
use "nest" schema
+  }
+
+  @AfterClass
+  public static void afterTests() throws Exception {
+// restore enable.update.log
+System.setProperty("enable.update.log", 
PREVIOUS_ENABLE_UPDATE_LOG_VALUE);
+  }
+
+  @Before
+  public void before() {
+clearIndex();
+assertU(commit());
+  }
+
+  @Test
+  public void testMergeChildDoc() throws Exception {
+SolrInputDocument doc = new SolrInputDocument();
+doc.setField("id", "1");
+doc.setField("cat_ss", new String[]{"aaa", "ccc"});
+doc.setField("child", Collections.singletonList(sdoc("id", "2", 
"cat_ss", "child")));
+addDoc(adoc(doc), "nested-rtg");
+
+BytesRef rootDocId = new BytesRef("1");
+SolrCore core = h.getCore();
+SolrInputDocument block = RealTimeGetComponent.getInputDocument(core, 
rootDocId, true);
+// assert block doc has child docs
+assertTrue(block.containsKey("child"));
+
+assertJQ(req("q","id:1")
+,"/response/numFound==0"
+);
+
+// commit the changes
+assertU(commit());
+
+SolrInputDocument newChildDoc = sdoc("id", "3", "cat_ss", "child");
+SolrInputDocument addedDoc = sdoc("id", "1",
+"cat_ss", Collections.singletonMap("add", "bbb"),
+"child", Collections.singletonMap("add", sdocs(newChildDoc)));
+block = RealTimeGetComponent.getInputDocument(core, rootDocId, true);
+block.removeField(VERSION);
+SolrInputDocument preMergeDoc = new SolrInputDocument(block);
+AtomicUpdateDocumentMerger docMerger = new 
AtomicUpdateDocumentMerger(req());
+docMerger.merge(addedDoc, block);
+assertEquals("merged document should have the same id", 
preMergeDoc.getFieldValue("id"), block.getFieldValue("id"));
+assertDocContainsSubset(preMergeDoc, block);
+assertDocContainsSubset(addedDoc, block);
+assertDocContainsSubset(newChildDoc, (SolrInputDocument) ((List) 
block.getFieldValues("child")).get(1));
+assertEquals(doc.getFieldValue("id"), block.getFieldValue("id"));
+  }
+
+  @Test
+  public void testBlockAtomicAdd() throws Exception {
+
+ 

[GitHub] lucene-solr pull request #455: SOLR-12638

2018-10-10 Thread dsmiley
Github user dsmiley commented on a diff in the pull request:

https://github.com/apache/lucene-solr/pull/455#discussion_r224131571
  
--- Diff: 
solr/core/src/test/org/apache/solr/update/processor/AtomicUpdateBlockTest.java 
---
@@ -0,0 +1,370 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.solr.update.processor;
+
+import java.util.Arrays;
+import java.util.Collection;
+import java.util.Collections;
+import java.util.List;
+
+import org.apache.lucene.util.BytesRef;
+import org.apache.solr.SolrTestCaseJ4;
+import org.apache.solr.common.SolrInputDocument;
+import org.apache.solr.common.SolrInputField;
+import org.apache.solr.core.SolrCore;
+import org.apache.solr.handler.component.RealTimeGetComponent;
+import org.junit.AfterClass;
+import org.junit.Before;
+import org.junit.BeforeClass;
+import org.junit.Test;
+
+public class AtomicUpdateBlockTest extends SolrTestCaseJ4 {
+
+  private final static String VERSION = "_version_";
+  private static String PREVIOUS_ENABLE_UPDATE_LOG_VALUE;
+
+  @BeforeClass
+  public static void beforeTests() throws Exception {
+PREVIOUS_ENABLE_UPDATE_LOG_VALUE = 
System.getProperty("enable.update.log");
+System.setProperty("enable.update.log", "true");
+initCore("solrconfig-block-atomic-update.xml", "schema-nest.xml"); // 
use "nest" schema
+  }
+
+  @AfterClass
+  public static void afterTests() throws Exception {
+// restore enable.update.log
+System.setProperty("enable.update.log", 
PREVIOUS_ENABLE_UPDATE_LOG_VALUE);
+  }
+
+  @Before
+  public void before() {
+clearIndex();
+assertU(commit());
+  }
+
+  @Test
+  public void testMergeChildDoc() throws Exception {
+SolrInputDocument doc = new SolrInputDocument();
+doc.setField("id", "1");
+doc.setField("cat_ss", new String[]{"aaa", "ccc"});
+doc.setField("child", Collections.singletonList(sdoc("id", "2", 
"cat_ss", "child")));
+addDoc(adoc(doc), "nested-rtg");
+
+BytesRef rootDocId = new BytesRef("1");
+SolrCore core = h.getCore();
+SolrInputDocument block = RealTimeGetComponent.getInputDocument(core, 
rootDocId, true);
+// assert block doc has child docs
+assertTrue(block.containsKey("child"));
+
+assertJQ(req("q","id:1")
+,"/response/numFound==0"
+);
+
+// commit the changes
+assertU(commit());
+
+SolrInputDocument newChildDoc = sdoc("id", "3", "cat_ss", "child");
+SolrInputDocument addedDoc = sdoc("id", "1",
+"cat_ss", Collections.singletonMap("add", "bbb"),
+"child", Collections.singletonMap("add", sdocs(newChildDoc)));
+block = RealTimeGetComponent.getInputDocument(core, rootDocId, true);
+block.removeField(VERSION);
+SolrInputDocument preMergeDoc = new SolrInputDocument(block);
+AtomicUpdateDocumentMerger docMerger = new 
AtomicUpdateDocumentMerger(req());
+docMerger.merge(addedDoc, block);
+assertEquals("merged document should have the same id", 
preMergeDoc.getFieldValue("id"), block.getFieldValue("id"));
+assertDocContainsSubset(preMergeDoc, block);
+assertDocContainsSubset(addedDoc, block);
+assertDocContainsSubset(newChildDoc, (SolrInputDocument) ((List) block.getFieldValues("child")).get(1));
+assertEquals(doc.getFieldValue("id"), block.getFieldValue("id"));
+  }
+
+  @Test
+  public void testBlockAtomicAdd() throws Exception {
+
+SolrInputDocument doc = sdoc("id", "1",
+"cat_ss", new String[] {"aaa", "ccc"},
+"child1", sdoc("id", "2", "cat_ss", "child")
+);
+json(doc);
--- End diff --

Accidentally added this `json(doc)` line?


---

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[GitHub] lucene-solr pull request #455: WIP: SOLR-12638

2018-10-08 Thread dsmiley
Github user dsmiley commented on a diff in the pull request:

https://github.com/apache/lucene-solr/pull/455#discussion_r223561565
  
--- Diff: 
solr/core/src/java/org/apache/solr/update/processor/AtomicUpdateDocumentMerger.java
 ---
@@ -461,5 +466,33 @@ private static boolean isChildDoc(Object obj) {
 }
 return objValues.iterator().next() instanceof SolrDocumentBase;
   }
+
+  private void removeObj(Collection original, Object toRemove, String 
fieldName) {
+if(isChildDoc(toRemove)) {
+  removeChildDoc(original, (SolrInputDocument) toRemove);
+} else {
+  original.remove(getNativeFieldValue(fieldName, toRemove));
+}
+  }
+
+  private static void removeChildDoc(Collection original, 
SolrInputDocument docToRemove) {
+for(SolrInputDocument doc: (Collection) original) {
+  if(isDerivedFromDoc(doc, docToRemove)) {
+original.remove(doc);
+return;
+  }
+}
+  }
+
+  private static boolean isDerivedFromDoc(SolrInputDocument fullDoc, 
SolrInputDocument subDoc) {
--- End diff --

I see the new docs; still it's unclear without the context of our 
conversation.  Should "subDoc" be "partialDoc"?  How is it that the fullDoc is 
"the document to be tested" when the return value is "wether subDoc is a subset 
of fullDoc"?  Isn't the subject of the test subDoc?


---

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[GitHub] lucene-solr pull request #455: WIP: SOLR-12638

2018-10-07 Thread dsmiley
Github user dsmiley commented on a diff in the pull request:

https://github.com/apache/lucene-solr/pull/455#discussion_r223244019
  
--- Diff: 
solr/core/src/java/org/apache/solr/update/processor/AtomicUpdateDocumentMerger.java
 ---
@@ -461,5 +466,33 @@ private static boolean isChildDoc(Object obj) {
 }
 return objValues.iterator().next() instanceof SolrDocumentBase;
   }
+
+  private void removeObj(Collection original, Object toRemove, String 
fieldName) {
+if(isChildDoc(toRemove)) {
+  removeChildDoc(original, (SolrInputDocument) toRemove);
+} else {
+  original.remove(getNativeFieldValue(fieldName, toRemove));
+}
+  }
+
+  private static void removeChildDoc(Collection original, 
SolrInputDocument docToRemove) {
+for(SolrInputDocument doc: (Collection) original) {
+  if(isDerivedFromDoc(doc, docToRemove)) {
+original.remove(doc);
+return;
+  }
+}
+  }
+
+  private static boolean isDerivedFromDoc(SolrInputDocument fullDoc, 
SolrInputDocument subDoc) {
--- End diff --

What does it mean to be "derived" here?  Perhaps you mean that subDoc is 
a partial update?  Javadocs would help.


---

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[GitHub] lucene-solr pull request #455: WIP: SOLR-12638

2018-10-07 Thread dsmiley
Github user dsmiley commented on a diff in the pull request:

https://github.com/apache/lucene-solr/pull/455#discussion_r223243871
  
--- Diff: 
solr/core/src/java/org/apache/solr/update/processor/AtomicUpdateDocumentMerger.java
 ---
@@ -461,5 +466,33 @@ private static boolean isChildDoc(Object obj) {
 }
 return objValues.iterator().next() instanceof SolrDocumentBase;
   }
+
+  private void removeObj(Collection original, Object toRemove, String 
fieldName) {
+if(isChildDoc(toRemove)) {
+  removeChildDoc(original, (SolrInputDocument) toRemove);
+} else {
+  original.remove(getNativeFieldValue(fieldName, toRemove));
+}
+  }
+
+  private static void removeChildDoc(Collection original, 
SolrInputDocument docToRemove) {
+for(SolrInputDocument doc: (Collection) original) {
+  if(isDerivedFromDoc(doc, docToRemove)) {
+original.remove(doc);
--- End diff --

Eh... I'm not sure it's as simple as `original.remove(doc)`; if it were, we 
wouldn't need the method `isDerivedFromDoc`.  Assuming we do need that 
"derived" predicate method, then we probably need to call remove on an iterator 
here; see the sketch below.
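
Roughly what I have in mind, as a minimal standalone sketch (the collection, values, and the equals check standing in for isDerivedFromDoc are all illustrative, not the PR's actual types):

```
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Iterator;

public class IteratorRemoveSketch {
  public static void main(String[] args) {
    Collection<String> original = new ArrayList<>(Arrays.asList("a", "b", "c"));
    String toRemove = "b";
    // Removing through the iterator removes exactly the element the predicate
    // matched, and stays safe if the loop ever continues past the removal.
    for (Iterator<String> it = original.iterator(); it.hasNext(); ) {
      if (it.next().equals(toRemove)) {   // stand-in for isDerivedFromDoc(doc, docToRemove)
        it.remove();
        break;
      }
    }
    System.out.println(original);         // prints [a, c]
  }
}
```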


---

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[GitHub] lucene-solr pull request #455: WIP: SOLR-12638

2018-10-07 Thread dsmiley
Github user dsmiley commented on a diff in the pull request:

https://github.com/apache/lucene-solr/pull/455#discussion_r223244909
  
--- Diff: 
solr/core/src/test/org/apache/solr/update/processor/AtomicUpdateBlockTest.java 
---
@@ -181,6 +182,180 @@ public void testBlockRealTimeGet() throws Exception {
 );
   }
 
+  @Test
+  public void testBlockAtomicSet() throws Exception {
+SolrInputDocument doc = sdoc("id", "1",
+"cat_ss", new String[] {"aaa", "ccc"},
+"child1", Collections.singleton(sdoc("id", "2", "cat_ss", "child"))
+);
+json(doc);
+addDoc(adoc(doc), "nested-rtg");
+
+BytesRef rootDocId = new BytesRef("1");
+SolrCore core = h.getCore();
+SolrInputDocument block = RealTimeGetComponent.getInputDocument(core, 
rootDocId, true);
+// assert block doc has child docs
+assertTrue(block.containsKey("child1"));
+
+assertJQ(req("q","id:1")
+,"/response/numFound==0"
+);
+
+// commit the changes
+assertU(commit());
+
+SolrInputDocument committedBlock = 
RealTimeGetComponent.getInputDocument(core, rootDocId, true);
+BytesRef childDocId = new BytesRef("2");
+// ensure the whole block is returned when resolveBlock is true and id 
of a child doc is provided
+assertEquals(committedBlock.toString(), 
RealTimeGetComponent.getInputDocument(core, childDocId, true).toString());
+
+assertJQ(req("q","id:1")
+,"/response/numFound==1"
+);
+
+assertJQ(req("qt","/get", "id","1", "fl","id, cat_ss, child1, [child]")
+,"=={\"doc\":{'id':\"1\"" +
+", cat_ss:[\"aaa\",\"ccc\"], 
child1:[{\"id\":\"2\",\"cat_ss\":[\"child\"]}]" +
+"   }}"
+);
+
+assertU(commit());
+
+assertJQ(req("qt","/get", "id","1", "fl","id, cat_ss, child1, [child]")
+,"=={\"doc\":{'id':\"1\"" +
+", cat_ss:[\"aaa\",\"ccc\"], 
child1:[{\"id\":\"2\",\"cat_ss\":[\"child\"]}]" +
+"   }}"
+);
+
+doc = sdoc("id", "1",
+"cat_ss", ImmutableMap.of("set", Arrays.asList("aaa", "bbb")),
--- End diff --

Minor: in cases where the JDK has alternatives to Guava, use them.  Here, 
it's `Collections.singletonMap` I recall.
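
For reference, a tiny standalone sketch of the JDK alternative (the wrapper class and printed output are just for illustration):

```
import java.util.Arrays;
import java.util.Collections;
import java.util.Map;

public class SingletonMapSketch {
  public static void main(String[] args) {
    // JDK equivalent of ImmutableMap.of("set", Arrays.asList("aaa", "bbb"))
    Map<String, Object> setOp = Collections.singletonMap("set", Arrays.asList("aaa", "bbb"));
    System.out.println(setOp);  // prints {set=[aaa, bbb]}
  }
}
```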


---

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[GitHub] lucene-solr pull request #455: WIP: SOLR-12638

2018-10-07 Thread dsmiley
Github user dsmiley commented on a diff in the pull request:

https://github.com/apache/lucene-solr/pull/455#discussion_r223244186
  
--- Diff: 
solr/core/src/java/org/apache/solr/update/processor/AtomicUpdateDocumentMerger.java
 ---
@@ -461,5 +466,33 @@ private static boolean isChildDoc(Object obj) {
 }
 return objValues.iterator().next() instanceof SolrDocumentBase;
   }
+
+  private void removeObj(Collection original, Object toRemove, String 
fieldName) {
+if(isChildDoc(toRemove)) {
+  removeChildDoc(original, (SolrInputDocument) toRemove);
+} else {
+  original.remove(getNativeFieldValue(fieldName, toRemove));
+}
+  }
+
+  private static void removeChildDoc(Collection original, 
SolrInputDocument docToRemove) {
+for(SolrInputDocument doc: (Collection) original) {
+  if(isDerivedFromDoc(doc, docToRemove)) {
+original.remove(doc);
+return;
+  }
+}
+  }
+
+  private static boolean isDerivedFromDoc(SolrInputDocument fullDoc, 
SolrInputDocument subDoc) {
+for(SolrInputField subSif: subDoc) {
+  String fieldName = subSif.getName();
+  if(!fullDoc.containsKey(fieldName)) return false;
+  Collection fieldValues = subDoc.getFieldValues(fieldName);
--- End diff --

Can't we get this directly off the subSif via subSif.getValues()?  And why 
would subSif.getValueCount() ever differ from fieldValues.size()?  Comments 
would help.


---

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[GitHub] lucene-solr pull request #455: WIP: SOLR-12638

2018-09-17 Thread dsmiley
Github user dsmiley commented on a diff in the pull request:

https://github.com/apache/lucene-solr/pull/455#discussion_r218293921
  
--- Diff: 
solr/core/src/java/org/apache/solr/update/processor/DistributedUpdateProcessor.java
 ---
@@ -1184,7 +1196,16 @@ protected boolean versionAdd(AddUpdateCommand cmd) 
throws IOException {
 
 // TODO: possibly set checkDeleteByQueries as a flag on the 
command?
 doLocalAdd(cmd);
-
+
+if(lastKnownVersion != null && req.getSchema().isUsableForChildDocs() &&
--- End diff --

This may very well be right but can you tell me why you added this delete 
here at this line and why the delete is version-dependent?


---

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[GitHub] lucene-solr issue #455: WIP: SOLR-12638

2018-09-17 Thread dsmiley
Github user dsmiley commented on the issue:

https://github.com/apache/lucene-solr/pull/455
  
ConvertedLegacyTest fails in part because line 306 adds with 
overwrite=false -- a bit of a dubious Solr feature that is probably not 
properly compatible with the UpdateLog which makes various assumptions about 
the uniqueKey being unique.  I'll email the dev list to see what others think 
but I'm inclined to think overwrite=false ought to be explicitly forbidden with 
an UpdateLog in place.  That ancient legacy test can use a config that doesn't 
have an updateLog.


---

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[GitHub] lucene-solr pull request #443: SOLR-12722: add fl param to ChildDocTransform...

2018-09-03 Thread dsmiley
Github user dsmiley commented on a diff in the pull request:

https://github.com/apache/lucene-solr/pull/443#discussion_r214773746
  
--- Diff: 
solr/core/src/java/org/apache/solr/response/transform/ChildDocTransformer.java 
---
@@ -132,6 +134,12 @@ public void transform(SolrDocument rootDoc, int 
rootDocId) {
 
   // load the doc
   SolrDocument doc = searcher.getDocFetcher().solrDoc(docId, 
childReturnFields);
+  if(childReturnFields.getTransformer() != null) {
+if(childReturnFields.getTransformer().context != this.context) 
{
--- End diff --

Maybe this line should only check whether the context is null?  Checking whether the 
context is different suggests there may already be a context, but if 
that were true then we ought to reinstate the former context when done.  I 
suspect it's always going to be null.  Can you look?


---

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[GitHub] lucene-solr pull request #443: SOLR-12722: add fl param to ChildDocTransform...

2018-09-03 Thread dsmiley
Github user dsmiley commented on a diff in the pull request:

https://github.com/apache/lucene-solr/pull/443#discussion_r214773781
  
--- Diff: 
solr/core/src/java/org/apache/solr/response/transform/ChildDocTransformerFactory.java
 ---
@@ -106,9 +123,20 @@ public DocTransformer create(String field, SolrParams 
params, SolrQueryRequest r
   }
 }
 
+String childReturnFields = params.get("fl");
+SolrReturnFields childSolrReturnFields;
+if(childReturnFields != null) {
+  childSolrReturnFields = new SolrReturnFields(childReturnFields, req);
+} else if(req.getSchema().getDefaultLuceneMatchVersion().major < 8) {
--- End diff --

Nice


---

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[GitHub] lucene-solr pull request #443: SOLR-12722: add fl param to ChildDocTransform...

2018-09-02 Thread dsmiley
Github user dsmiley commented on a diff in the pull request:

https://github.com/apache/lucene-solr/pull/443#discussion_r214541964
  
--- Diff: 
solr/core/src/java/org/apache/solr/response/transform/ChildDocTransformerFactory.java
 ---
@@ -57,12 +58,34 @@
 
   static final char PATH_SEP_CHAR = '/';
   static final char NUM_SEP_CHAR = '#';
+  private static final ThreadLocal<Boolean> transformerInitialized = new ThreadLocal<Boolean>(){
--- End diff --

This is the classic but more verbose way to declare a ThreadLocal.  In Java 
8 this can be a one-liner:  
```ThreadLocal<Boolean> recursionCheckThreadLocal = 
ThreadLocal.withInitial(() -> Boolean.FALSE);```
And please consider another field name, not "transformerInitialized"?  I 
suggest "recursionCheckThreadLocal" since it declares its purpose (to detect 
recursion) and its type (it's a ThreadLocal).  "initialized" is dubious to me 
because it suggests an instance field of a class, but this one is global.


---

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[GitHub] lucene-solr pull request #443: SOLR-12722: add fl param to ChildDocTransform...

2018-09-02 Thread dsmiley
Github user dsmiley commented on a diff in the pull request:

https://github.com/apache/lucene-solr/pull/443#discussion_r214542527
  
--- Diff: 
solr/core/src/test/org/apache/solr/response/transform/TestChildDocTransformer.java
 ---
@@ -243,11 +250,39 @@ private void testChildDocNonStoredDVFields() throws 
Exception {
 "fl", "*,[child parentFilter=\"subject:parentDocument\"]"), test1);
 
 assertJQ(req("q", "*:*", "fq", "subject:\"parentDocument\" ",
-"fl", "subject,[child parentFilter=\"subject:parentDocument\" 
childFilter=\"title:foo\"]"), test2);
+"fl", "intDvoDefault, subject,[child 
parentFilter=\"subject:parentDocument\" childFilter=\"title:foo\"]"), test2);
 
 assertJQ(req("q", "*:*", "fq", "subject:\"parentDocument\" ",
-"fl", "subject,[child parentFilter=\"subject:parentDocument\" 
childFilter=\"title:bar\" limit=2]"), test3);
+"fl", "intDvoDefault, subject,[child 
parentFilter=\"subject:parentDocument\" childFilter=\"title:bar\" limit=2]"), 
test3);
+
+  }
+
+  private void testChildReturnFields() throws Exception {
 
+assertJQ(req("q", "*:*", "fq", "subject:\"parentDocument\" ",
+"fl", "*,[child parentFilter=\"subject:parentDocument\" 
fl=\"intDvoDefault,child_fl:[value v='child_fl_test']\"]"),
+"/response/docs/[0]/intDefault==42",
+"/response/docs/[0]/_childDocuments_/[0]/intDvoDefault==42",
+
"/response/docs/[0]/_childDocuments_/[0]/child_fl=='child_fl_test'");
+
+try(SolrQueryRequest req = req("q", "*:*", "fq", 
"subject:\"parentDocument\" ",
--- End diff --

Do you know XPath?  assertQ takes XPath expressions that are powerful enough 
to do everything you are asserting here in less code.  (assertQ is used in a 
*ton* of tests; plenty of examples to learn from.)  The assertJQ thing is less 
powerful.  You don't *have* to rewrite it, but it's at least worth being aware 
of for future assertions; see the sketch below.
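
For illustration, a hedged sketch of what such an assertion could look like inside a SolrTestCaseJ4 test (the XPath expressions, field names, and expected count are assumptions about this test's data, not copied from it):

```
// assertQ runs the request and checks each XPath test against the XML response
assertQ(req("q", "*:*", "fq", "subject:\"parentDocument\""),
    "//result[@numFound='1']",
    "//doc/str[@name='subject'][.='parentDocument']");
```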


---

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[GitHub] lucene-solr pull request #443: SOLR-12722: add fl param to ChildDocTransform...

2018-09-02 Thread dsmiley
Github user dsmiley commented on a diff in the pull request:

https://github.com/apache/lucene-solr/pull/443#discussion_r214542110
  
--- Diff: 
solr/core/src/java/org/apache/solr/response/transform/ChildDocTransformerFactory.java
 ---
@@ -57,12 +58,34 @@
 
   static final char PATH_SEP_CHAR = '/';
   static final char NUM_SEP_CHAR = '#';
+  private static final ThreadLocal<Boolean> transformerInitialized = new ThreadLocal<Boolean>(){
+@Override
+protected Boolean initialValue() {
+  this.set(false);
+  return false;
+}
+  };
   private static final BooleanQuery rootFilter = new BooleanQuery.Builder()
   .add(new BooleanClause(new MatchAllDocsQuery(), 
BooleanClause.Occur.MUST))
   .add(new BooleanClause(new 
DocValuesFieldExistsQuery(NEST_PATH_FIELD_NAME), 
BooleanClause.Occur.MUST_NOT)).build();
 
   @Override
   public DocTransformer create(String field, SolrParams params, 
SolrQueryRequest req) {
+if(transformerInitialized.get()) {
+  // this is a recursive call by SolrReturnFields see 
#createChildDocTransformer
+  return new DocTransformer.NoopFieldTransformer();
+} else {
+  try {
+// transformer is yet to be initialized in this thread, create it
+transformerInitialized.set(true);
+return createChildDocTransformer(field, params, req);
+  } finally {
+transformerInitialized.remove();
--- End diff --

Hmm; maybe clearer to set(false)


---

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[GitHub] lucene-solr pull request #443: SOLR-12722: add fl param to ChildDocTransform...

2018-09-02 Thread dsmiley
Github user dsmiley commented on a diff in the pull request:

https://github.com/apache/lucene-solr/pull/443#discussion_r214541721
  
--- Diff: 
solr/core/src/java/org/apache/solr/response/transform/ChildDocTransformer.java 
---
@@ -54,24 +54,26 @@
   private final DocSet childDocSet;
   private final int limit;
   private final boolean isNestedSchema;
+  private final SolrReturnFields childReturnFields;
 
-  //TODO ought to be provided/configurable
-  private final SolrReturnFields childReturnFields = new 
SolrReturnFields();
-
-  ChildDocTransformer(String name, BitSetProducer parentsFilter,
-  DocSet childDocSet, boolean isNestedSchema, int 
limit) {
+  ChildDocTransformer(String name, BitSetProducer parentsFilter, DocSet 
childDocSet,
+  SolrReturnFields returnFields, boolean 
isNestedSchema, int limit) {
 this.name = name;
 this.parentsFilter = parentsFilter;
 this.childDocSet = childDocSet;
 this.limit = limit;
 this.isNestedSchema = isNestedSchema;
+this.childReturnFields = returnFields!=null? returnFields: new 
SolrReturnFields();
   }
 
   @Override
   public String getName()  {
 return name;
   }
 
+  @Override
+  public boolean needsSolrIndexSearcher() { return true; }
--- End diff --

+1 nice catch! 


---

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[GitHub] lucene-solr pull request #443: SOLR-12722: add fl param to ChildDocTransform...

2018-09-02 Thread dsmiley
Github user dsmiley commented on a diff in the pull request:

https://github.com/apache/lucene-solr/pull/443#discussion_r214542721
  
--- Diff: 
solr/solrj/src/test/org/apache/solr/client/solrj/SolrExampleTests.java ---
@@ -1831,7 +1831,7 @@ public void testChildDoctransformer() throws 
IOException, SolrServerException {
   
   q = new SolrQuery("q", "*:*", "indent", "true");
   q.setFilterQueries(parentFilter);
-  q.setFields("id,[child parentFilter=\"" + parentFilter +
+  q.setFields("id, level_i, [child parentFilter=\"" + parentFilter +
--- End diff --

I'm confused; this change suggests there is a backwards-compatibility 
change; is there?  I think you said that the child "fl" is effectively 
constrained by whatever the top "fl" is, which, if totally true, means (I think) 
we needn't worry about defaulting to the top "fl".  If this isn't totally correct 
(hence your change) then we need to be careful about changing the default 
behavior in a minor release; we'd have to tweak the default choice (absence of a 
specific "fl") using the luceneMatchVersion.


---

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[GitHub] lucene-solr pull request #443: SOLR-12722: add fl param to ChildDocTransform...

2018-09-02 Thread dsmiley
Github user dsmiley commented on a diff in the pull request:

https://github.com/apache/lucene-solr/pull/443#discussion_r214542262
  
--- Diff: 
solr/core/src/java/org/apache/solr/response/transform/DocTransformer.java ---
@@ -114,4 +114,27 @@ public void transform(SolrDocument doc, int docid, 
float score) throws IOExcepti
   public String toString() {
 return getName();
   }
+
+  /**
+   * Trivial Impl that ensure that the specified field is requested as an 
"extra" field,
+   * but then does nothing during the transformation phase.
+   */
+  public static final class NoopFieldTransformer extends DocTransformer {
--- End diff --

Oh whoops; I recommended you do this but overlooked that it's not *just* a no-op; 
it's also doing this extra field-requesting task.  Well I suppose it's okay; not a 
big deal.  Alternatively, this could be changed to be *just* a no-op, and then the 
former impl in RawValueTransformerFactory.java could exist as a subclass that 
is a bit simpler.


---

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[GitHub] lucene-solr pull request #443: SOLR-12722: add fl param to ChildDocTransform...

2018-09-02 Thread dsmiley
Github user dsmiley commented on a diff in the pull request:

https://github.com/apache/lucene-solr/pull/443#discussion_r214542325
  
--- Diff: 
solr/core/src/test/org/apache/solr/response/transform/TestChildDocTransformer.java
 ---
@@ -91,10 +98,10 @@ private void testChildDoctransformerXML() throws 
Exception {
 "fl", "*,[child parentFilter=\"subject:parentDocument\"]"), test1);
 
 assertQ(req("q", "*:*", "fq", "subject:\"parentDocument\" ",
-"fl", "subject,[child parentFilter=\"subject:parentDocument\" 
childFilter=\"title:foo\"]"), test2);
+"fl", "id, subject,[child parentFilter=\"subject:parentDocument\" 
childFilter=\"title:foo\"]"), test2);
--- End diff --

nitpick: remove that extra space after the comma here and in the other 
lines you modified.


---

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[GitHub] lucene-solr pull request #443: SOLR-12722: add fl param to ChildDocTransform...

2018-09-02 Thread dsmiley
Github user dsmiley commented on a diff in the pull request:

https://github.com/apache/lucene-solr/pull/443#discussion_r214541692
  
--- Diff: 
solr/core/src/java/org/apache/solr/response/transform/ChildDocTransformer.java 
---
@@ -131,6 +131,12 @@ public void transform(SolrDocument rootDoc, int 
rootDocId) {
 
   // load the doc
   SolrDocument doc = searcher.getDocFetcher().solrDoc(docId, 
childReturnFields);
--- End diff --

I was thinking the same thing!  But yeah; out of scope for this PR.  CC 
@ErickErickson 


---

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[GitHub] lucene-solr pull request #433: SOLR-12357 Premptive creation of collections ...

2018-08-30 Thread dsmiley
Github user dsmiley commented on a diff in the pull request:

https://github.com/apache/lucene-solr/pull/433#discussion_r214121353
  
--- Diff: 
solr/core/src/java/org/apache/solr/update/processor/TimeRoutedAliasUpdateProcessor.java
 ---
@@ -198,32 +209,32 @@ private String createCollectionsIfRequired(Instant 
docTimestamp, String targetCo
 // probably don't write to the same alias. As such, we have deferred 
any solution to the "many clients causing
 // collection creation simultaneously" problem until such time as 
someone actually has that problem in a
 // real world use case that isn't just an anti-pattern.
+Map.Entry candidateCollectionDesc = 
findCandidateGivenTimestamp(docTimestamp, cmd.getPrintableId());
+String candidateCollectionName = candidateCollectionDesc.getValue();
 try {
-  CreationType creationType = requiresCreateCollection(docTimestamp, 
timeRoutedAlias.getPreemptiveCreateWindow());
-  switch (creationType) {
+  switch (typeOfCreationRequired(docTimestamp, 
candidateCollectionDesc.getKey())) {
 case SYNCHRONOUS:
   // This next line blocks until all collections required by the 
current document have been created
-  return maintain(targetCollection, docTimestamp, printableId, 
false);
+  return createAllRequiredCollections(docTimestamp, printableId, 
candidateCollectionDesc);
 case ASYNC_PREEMPTIVE:
-  // Note: creating an executor and throwing it away is slightly 
expensive, but this is only likely to happen
-  // once per hour/day/week (depending on time slice size for the 
TRA). If the executor were retained, it
-  // would need to be shut down in a close hook to avoid test 
failures due to thread leaks which is slightly
-  // more complicated from a code maintenance and readability 
stand point. An executor must used instead of a
-  // thread to ensure we pick up the proper MDC logging stuff from 
ExecutorUtil. T
   if (preemptiveCreationExecutor == null) {
-DefaultSolrThreadFactory threadFactory = new 
DefaultSolrThreadFactory("TRA-preemptive-creation");
-preemptiveCreationExecutor = 
newMDCAwareSingleThreadExecutor(threadFactory);
-preemptiveCreationExecutor.execute(() -> {
-  maintain(targetCollection, docTimestamp, printableId, true);
-  preemptiveCreationExecutor.shutdown();
-  preemptiveCreationExecutor = null;
-});
+// It's important not to add code between here and the prior 
call to findCandidateGivenTimestamp()
+// in processAdd() that invokes 
updateParsedCollectionAliases(). Doing so would update parsedCollectionsDesc
+// and create a race condition. We are relying on the fact 
that get(0) is returning the head of the parsed
+// collections that existed when candidateCollectionDesc was 
created. If this class updates it's notion of
+// parsedCollectionsDesc since candidateCollectionDesc was 
chosen, we could create collection n+2
+// instead of collection n+1.
+
+// This line does not block and the document can be added 
immediately
+preemptiveAsync(() -> 
createNextCollection(this.parsedCollectionsDesc.get(0).getValue()));
--- End diff --

Bug: do not refer to this.parsedCollectionsDesc from the lambda runnable, as 
it is not thread-safe (and I don't think we should try to make it so).  The 
input should be gathered on the immediately prior line (thus in the main thread) 
so that it's already resolved; see the sketch below.

To make this bug harder to reintroduce (you did it twice), you could change 
preemptiveAsync() to take the collection name instead of taking a lambda.  It 
could then be named createNextCollectionAsync.
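
A minimal standalone sketch of the point (the executor, list, and collection name are illustrative; the real code would use the existing preemptiveCreationExecutor and createNextCollection):

```
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class CaptureBeforeAsyncSketch {
  public static void main(String[] args) {
    ExecutorService executor = Executors.newSingleThreadExecutor();
    List<String> parsedCollections = new ArrayList<>();
    parsedCollections.add("myalias_2018-08-30");   // illustrative collection name

    // read the shared state on the calling (main) thread...
    String mostRecentCollName = parsedCollections.get(0);
    // ...so the lambda only captures an effectively-final local String
    executor.execute(() -> createNextCollection(mostRecentCollName));
    executor.shutdown();
  }

  static void createNextCollection(String mostRecentCollName) {
    System.out.println("would create the collection that follows " + mostRecentCollName);
  }
}
```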


---

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[GitHub] lucene-solr pull request #433: SOLR-12357 Premptive creation of collections ...

2018-08-30 Thread dsmiley
Github user dsmiley commented on a diff in the pull request:

https://github.com/apache/lucene-solr/pull/433#discussion_r214122537
  
--- Diff: 
solr/core/src/java/org/apache/solr/update/processor/TimeRoutedAliasUpdateProcessor.java
 ---
@@ -165,31 +165,42 @@ private String getAliasName() {
 
   @Override
   public void processAdd(AddUpdateCommand cmd) throws IOException {
-SolrInputDocument solrInputDocument = cmd.getSolrInputDocument();
-final Object routeValue = 
solrInputDocument.getFieldValue(timeRoutedAlias.getRouteField());
-final Instant docTimestampToRoute = parseRouteKey(routeValue);
-updateParsedCollectionAliases();
-String candidateCollection = 
findCandidateCollectionGivenTimestamp(docTimestampToRoute, 
cmd.getPrintableId());
-final Instant maxFutureTime = 
Instant.now().plusMillis(timeRoutedAlias.getMaxFutureMs());
+final Instant docTimestamp =
+
parseRouteKey(cmd.getSolrInputDocument().getFieldValue(timeRoutedAlias.getRouteField()));
+
 // TODO: maybe in some cases the user would want to ignore/warn 
instead?
-if (docTimestampToRoute.isAfter(maxFutureTime)) {
+if 
(docTimestamp.isAfter(Instant.now().plusMillis(timeRoutedAlias.getMaxFutureMs(
 {
   throw new SolrException(SolrException.ErrorCode.BAD_REQUEST,
-  "The document's time routed key of " + docTimestampToRoute + " 
is too far in the future given " +
+  "The document's time routed key of " + docTimestamp + " is too 
far in the future given " +
   TimeRoutedAlias.ROUTER_MAX_FUTURE + "=" + 
timeRoutedAlias.getMaxFutureMs());
 }
-String targetCollection = 
createCollectionsIfRequired(docTimestampToRoute, candidateCollection, 
cmd.getPrintableId());
+
+// to avoid potential for race conditions, this next method should not 
get called again unless
+// we have created a collection synchronously
+updateParsedCollectionAliases();
+
+String targetCollection = createCollectionsIfRequired(docTimestamp, 
cmd.getPrintableId(), cmd);
+
 if (thisCollection.equals(targetCollection)) {
   // pass on through; we've reached the right collection
   super.processAdd(cmd);
 } else {
   // send to the right collection
-  SolrCmdDistributor.Node targetLeaderNode = 
routeDocToSlice(targetCollection, solrInputDocument);
+  SolrCmdDistributor.Node targetLeaderNode = 
routeDocToSlice(targetCollection, cmd.getSolrInputDocument());
   cmdDistrib.distribAdd(cmd, 
Collections.singletonList(targetLeaderNode), new 
ModifiableSolrParams(outParamsToLeader));
 }
   }
 
-
-  private String createCollectionsIfRequired(Instant docTimestamp, String 
targetCollection, String printableId) {
+  /**
+   * Create any required collections and return the name of the collection 
to which the current document should be sent.
+   *
+   * @param docTimestamp the date for the document taken from the field 
specified in the TRA config
--- End diff --

These parameter-level docs are fine, but FYI don't feel that you have to do 
this, especially for private methods, and especially for obvious parameters 
(i.e. you don't *have* to document them all; you could just do some at your 
discretion, or none).  This is subjective of course; I'm sharing my opinion and 
not insisting on anything.  My preference leans towards... "if you have 
something helpful to say then say it, but avoid writing the obvious" (for 
various reasons).

Oh look... you don't need to pass printableId since you could get it from the 
AddUpdateCommand if needed.  printableId is a wart on these method signatures 
IMO.


---

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[GitHub] lucene-solr pull request #433: SOLR-12357 Premptive creation of collections ...

2018-08-30 Thread dsmiley
Github user dsmiley commented on a diff in the pull request:

https://github.com/apache/lucene-solr/pull/433#discussion_r214120233
  
--- Diff: 
solr/core/src/java/org/apache/solr/update/processor/TimeRoutedAliasUpdateProcessor.java
 ---
@@ -437,45 +461,46 @@ protected void doClose() {
 
 
   /**
-   * Create as many collections as required. This method loops to allow 
for the possibility that the routeTimestamp
+   * Create as many collections as required. This method loops to allow 
for the possibility that the docTimestamp
* requires more than one collection to be created. Since multiple 
threads may be invoking maintain on separate
* requests to the same alias, we must pass in the name of the 
collection that this thread believes to be the most
* recent collection. This assumption is checked when the command is 
executed in the overseer. When this method
* finds that all collections required have been created it returns the 
(possibly new) most recent collection.
* The return value is ignored by the calling code in the async 
preemptive case.
*
-   * @param targetCollection the initial notion of the latest collection 
available.
* @param docTimestamp the timestamp from the document that determines 
routing
* @param printableId an identifier for the add command used in error 
messages
+   * @param targetCollectionDesc the descriptor for the presently selected 
collection which should also be
+   * the most recent collection in all cases 
where this method is invoked.
* @return The latest collection, including collections created during 
maintenance
*/
-  public String maintain(String targetCollection, Instant docTimestamp, 
String printableId, boolean asyncSinglePassOnly) {
-do { // typically we don't loop; it's only when we need to create a 
collection
-
-  // Note: This code no longer short circuits immediately when it sees 
that the expected latest
-  // collection is the current latest collection. With the advent of 
preemptive collection creation
-  // we always need to do the time based checks. Otherwise, we cannot 
handle the case where the
-  // preemptive window is larger than our TRA's time slices
-
-  // Check the doc isn't too far in the future
-
-  if (NONE == requiresCreateCollection(docTimestamp, 
timeRoutedAlias.getPreemptiveCreateWindow()))
-return targetCollection; // thus we don't need another collection
+  private String createAllRequiredCollections( Instant docTimestamp, 
String printableId,
+   Map.Entry 
targetCollectionDesc) {
+do {
+  switch(typeOfCreationRequired(docTimestamp, 
targetCollectionDesc.getKey())) {
+case NONE:
+  return targetCollectionDesc.getValue(); // we don't need another 
collection
+case ASYNC_PREEMPTIVE:
+  // can happen when preemptive interval is longer than one time 
slice
+  preemptiveAsync(() -> 
createNextCollection(this.parsedCollectionsDesc.get(0).getValue()));
--- End diff --

Bug: do not refer to this.parsedCollectionsDesc from the lambda runnable, as 
it is not thread-safe (and I don't think we should try to make it so).  The 
input should be gathered on the immediately prior line (thus in the main thread) 
so that it's already resolved.


---

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[GitHub] lucene-solr pull request #433: SOLR-12357 Premptive creation of collections ...

2018-08-30 Thread dsmiley
Github user dsmiley commented on a diff in the pull request:

https://github.com/apache/lucene-solr/pull/433#discussion_r214118388
  
--- Diff: 
solr/core/src/java/org/apache/solr/update/processor/TimeRoutedAliasUpdateProcessor.java
 ---
@@ -437,45 +461,46 @@ protected void doClose() {
 
 
   /**
-   * Create as many collections as required. This method loops to allow 
for the possibility that the routeTimestamp
+   * Create as many collections as required. This method loops to allow 
for the possibility that the docTimestamp
* requires more than one collection to be created. Since multiple 
threads may be invoking maintain on separate
* requests to the same alias, we must pass in the name of the 
collection that this thread believes to be the most
* recent collection. This assumption is checked when the command is 
executed in the overseer. When this method
* finds that all collections required have been created it returns the 
(possibly new) most recent collection.
* The return value is ignored by the calling code in the async 
preemptive case.
*
-   * @param targetCollection the initial notion of the latest collection 
available.
* @param docTimestamp the timestamp from the document that determines 
routing
* @param printableId an identifier for the add command used in error 
messages
+   * @param targetCollectionDesc the descriptor for the presently selected 
collection which should also be
+   * the most recent collection in all cases 
where this method is invoked.
* @return The latest collection, including collections created during 
maintenance
*/
-  public String maintain(String targetCollection, Instant docTimestamp, 
String printableId, boolean asyncSinglePassOnly) {
-do { // typically we don't loop; it's only when we need to create a 
collection
-
-  // Note: This code no longer short circuits immediately when it sees 
that the expected latest
-  // collection is the current latest collection. With the advent of 
preemptive collection creation
-  // we always need to do the time based checks. Otherwise, we cannot 
handle the case where the
-  // preemptive window is larger than our TRA's time slices
-
-  // Check the doc isn't too far in the future
-
-  if (NONE == requiresCreateCollection(docTimestamp, 
timeRoutedAlias.getPreemptiveCreateWindow()))
-return targetCollection; // thus we don't need another collection
+  private String createAllRequiredCollections( Instant docTimestamp, 
String printableId,
+   Map.Entry 
targetCollectionDesc) {
+do {
+  switch(typeOfCreationRequired(docTimestamp, 
targetCollectionDesc.getKey())) {
+case NONE:
+  return targetCollectionDesc.getValue(); // we don't need another 
collection
+case ASYNC_PREEMPTIVE:
+  // can happen when preemptive interval is longer than one time 
slice
+  preemptiveAsync(() -> 
createNextCollection(this.parsedCollectionsDesc.get(0).getValue()));
+  return targetCollectionDesc.getValue();
+case SYNCHRONOUS:
+  createNextCollection(targetCollectionDesc.getValue()); // 
*should* throw if fails for some reason but...
+  if (!updateParsedCollectionAliases()) { // thus we didn't make 
progress...
+// this is not expected, even in known failure cases, but we 
check just in case
+throw new SolrException(SolrException.ErrorCode.SERVER_ERROR,
+"We need to create a new time routed collection but for 
unknown reasons were unable to do so.");
+  }
+  // then retry the loop ... have to do find again in case other 
requests also added collections
+  // that were made visible when we called 
updateParsedCollectionAliases()
+  targetCollectionDesc = findCandidateGivenTimestamp(docTimestamp, 
printableId);
+  break;
+default:
+  throw new SolrException(SolrException.ErrorCode.SERVER_ERROR, 
"Unknown creation type while adding " +
--- End diff --

I wouldn't waste your virtual breath (lines of code with thoughtful 
explanation) on a default case of an enum switch.  Another approach I like is 
to `assert ENUMNAME.values().length == 3;` before the switch, which will be 
caught at test time.  I forget, but if Java still insists we have a default, then 
throw IllegalStateException or something like that in a one-liner without 
explanation.  (Keep it short & sweet; see the sketch below.)
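
A small standalone sketch of that pattern (CreationType and the case bodies are placeholders for the PR's real enum and logic):

```
public class CreationTypeSwitchSketch {
  enum CreationType { NONE, ASYNC_PREEMPTIVE, SYNCHRONOUS }   // stand-in for the PR's enum

  static String dispatch(CreationType type) {
    assert CreationType.values().length == 3;   // trips at test time if a constant is added
    switch (type) {
      case NONE:             return "none";
      case ASYNC_PREEMPTIVE: return "async";
      case SYNCHRONOUS:      return "sync";
      default: throw new IllegalStateException("Unknown creation type: " + type);  // short & sweet
    }
  }

  public static void main(String[] args) {
    System.out.println(dispatch(CreationType.SYNCHRONOUS));
  }
}
```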


---

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[GitHub] lucene-solr issue #443: SOLR-12722: add fl param to ChildDocTransformer

2018-08-30 Thread dsmiley
Github user dsmiley commented on the issue:

https://github.com/apache/lucene-solr/pull/443
  
Oh right; we'd get an infinite loop as the ChildDocTransformerFactory is 
already being called by SolrReturnFields!  Ouch!

Sorry but I don't like your solution of trying to parse/strip the child 
transformer "fl"; it's very error-prone to do such things.

Let's define a ThreadLocal<Boolean> that lets us know if we're being 
asked to create this transformer from within a recursive call to 
SolrReturnFields.  When we see this happening, we can return a no-op 
transformer (see RawValueTransformerFactory.NoopFieldTransformer, which could be 
made public and moved to an inner class of DocTransformer).  This thread local 
would be set & cleared in a try-finally, and this logic would live in some 
utility method to keep it from being distracting to the caller; a sketch follows 
below.  In general I avoid ThreadLocals as they're a singleton design pattern, 
which is usually bad, but they can be used sparingly/judiciously.
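
A minimal sketch of the guard, with illustrative names (the real factory returns a DocTransformer; a Supplier<String> stands in here):

```
import java.util.function.Supplier;

public class RecursionGuardSketch {
  private static final ThreadLocal<Boolean> recursionCheckThreadLocal =
      ThreadLocal.withInitial(() -> Boolean.FALSE);

  // Guards realCreate so a re-entrant call on the same thread gets a cheap no-op result.
  static String create(Supplier<String> realCreate) {
    if (recursionCheckThreadLocal.get()) {
      return "no-op";                 // we were re-entered while parsing the nested "fl"
    }
    recursionCheckThreadLocal.set(Boolean.TRUE);
    try {
      return realCreate.get();        // may indirectly call create() again on this thread
    } finally {
      recursionCheckThreadLocal.set(Boolean.FALSE);   // always cleared, even on exception
    }
  }

  public static void main(String[] args) {
    // prints: real transformer, nested call got: no-op
    System.out.println(create(() -> "real transformer, nested call got: " + create(() -> "?")));
  }
}
```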

BTW it appears that doc transformers will not be executed on the child 
docs.  The transformer is not invoking them, and I suspect they won't be 
invoked on the way out to the client.  Can you check/test if this is true?  If 
I'm correct, this is a pre-existing bug/limitation that would be good to mark 
as a TODO; or you can try to address it here since it's related (before this 
change you couldn't specify the child 'fl', therefore it wasn't possible to 
specify a transformer until now).


---

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[GitHub] lucene-solr pull request #:

2018-08-30 Thread dsmiley
Github user dsmiley commented on the pull request:


https://github.com/apache/lucene-solr/commit/da0856f1dc41b7ea85ce71801900d63c06422db2#commitcomment-30344425
  
In 
solr/core/src/java/org/apache/solr/response/transform/ChildDocTransformerFactory.java:
In 
solr/core/src/java/org/apache/solr/response/transform/ChildDocTransformerFactory.java
 on line 111:
BTW the pre-parsed returnFields is on the SolrQueryResponse.  If it's 
inconvenient to get it then you can just do what you have here.


---

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[GitHub] lucene-solr pull request #443: SOLR-12722: add fl param to ChildDocTransform...

2018-08-30 Thread dsmiley
Github user dsmiley commented on a diff in the pull request:

https://github.com/apache/lucene-solr/pull/443#discussion_r214044533
  
--- Diff: solr/solr-ref-guide/src/transforming-result-documents.adoc ---
@@ -137,6 +137,7 @@ When using this transformer, the `parentFilter` 
parameter must be specified, and
 
 * `childFilter` - query to filter which child documents should be 
included, this can be particularly useful when you have multiple levels of 
hierarchical documents (default: all children)
 * `limit` - the maximum number of child documents to be returned per 
parent document (default: 10)
+* `fl` - the field list which the transformer is to return.
--- End diff --

... "There is a further limitation in which the fields here should be a 
subset of those specified by the top level fl param."  And say defaults to the 
top level "fl" ?


---

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[GitHub] lucene-solr pull request #443: SOLR-12722: add fl param to ChildDocTransform...

2018-08-30 Thread dsmiley
Github user dsmiley commented on a diff in the pull request:

https://github.com/apache/lucene-solr/pull/443#discussion_r214044911
  
--- Diff: 
solr/core/src/java/org/apache/solr/response/transform/ChildDocTransformerFactory.java
 ---
@@ -106,9 +107,12 @@ public DocTransformer create(String field, SolrParams 
params, SolrQueryRequest r
   }
 }
 
+String childReturnFields = params.get("fl");
+SolrReturnFields childSolrReturnFields = childReturnFields==null? new 
SolrReturnFields(): new SolrReturnFields(childReturnFields ,req);
--- End diff --

Shouldn't this default to the top level "fl"?  I think so.  Even if it 
didn't do this before, it in-effect behaves this way and it makes sense (to me) 
that it would.


---

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[GitHub] lucene-solr pull request #443: SOLR-12722: add fl param to ChildDocTransform...

2018-08-30 Thread dsmiley
Github user dsmiley commented on a diff in the pull request:

https://github.com/apache/lucene-solr/pull/443#discussion_r214018618
  
--- Diff: 
solr/core/src/test/org/apache/solr/response/transform/TestChildDocTransformerHierarchy.java
 ---
@@ -297,6 +297,35 @@ public void testNoChildren() throws Exception {
 "/response/docs/[0]/type_s==cake");
   }
 
+  @Test
+  public void testChildReturnFields() throws Exception {
+indexSampleData(numberOfDocsPerNestedTest);
+
+assertJQ(req("q", "test_s:testing",
+"sort", "id asc",
+"fl", "*, [child 
childFilter='lonely/lonelyGrandChild/test2_s:secondTest' fl='*, _nest_path_']",
--- End diff --

Let's leave `_nest_path_` as internal and not mention it, since testing "fl" 
has nothing to do with this field or nested documents.  So instead use some 
other field, foo_s or whatever.  And can you move this test to 
TestChildDocTransformer.java, as it's not related to hierarchy?


---

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[GitHub] lucene-solr pull request #416: WIP: SOLR-12519

2018-08-28 Thread dsmiley
Github user dsmiley commented on a diff in the pull request:

https://github.com/apache/lucene-solr/pull/416#discussion_r213541579
  
--- Diff: 
solr/core/src/test/org/apache/solr/response/transform/TestChildDocTransformerHierarchy.java
 ---
@@ -124,10 +124,11 @@ public void testParentFilterLimitJSON() throws 
Exception {
 
 assertJQ(req("q", "type_s:donut",
 "sort", "id asc",
-"fl", "id, type_s, toppings, _nest_path_, [child limit=1]",
+"fl", "id, type_s, lonely, lonelyGrandChild, test_s, test2_s, 
_nest_path_, [child limit=1]",
--- End diff --

To my point I wrote in JIRA: it's sad that when I see this I have no idea 
if it's right/wrong without having to go look at indexSampleData and then think 
about it.  No?  (This isn't a critique of you in particular; lots of tests, 
including some I've written, look like the current tests here.)  Imagine one 
doc with some nested docs, all of which only have their ID.  Since they only 
have their ID, it's not a lot of literal text in JSON.  The BeforeClass 
unmatched docs could use negative IDs to easily know who's who.  Anyway, if you 
would rather leave this as a "TODO" for another day then I understand.


---

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[GitHub] lucene-solr pull request #416: WIP: SOLR-12519

2018-08-28 Thread dsmiley
Github user dsmiley commented on a diff in the pull request:

https://github.com/apache/lucene-solr/pull/416#discussion_r213540255
  
--- Diff: 
solr/core/src/java/org/apache/solr/response/transform/ChildDocTransformer.java 
---
@@ -123,6 +124,16 @@ public void transform(SolrDocument rootDoc, int 
rootDocId) {
 
 // Do we need to do anything with this doc (either ancestor or 
matched the child query)
 if (isAncestor || childDocSet == null || 
childDocSet.exists(docId)) {
+
+  if(limit != -1) {
+if(!isAncestor) {
+  if(matches == limit) {
+continue;
+  }
+  ++matches;
--- End diff --

I think matches should be incremented if it's in childDocSet (which includes 
childDocSet being null).  Whether it's an ancestor or not doesn't matter, I 
think.  You could pull out a new variable isInChildDocSet.  Or I suppose simply 
consider all of them a match, which I see is what you just did as I write this; 
that's fine too.


---

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[GitHub] lucene-solr pull request #416: WIP: SOLR-12519

2018-08-28 Thread dsmiley
Github user dsmiley commented on a diff in the pull request:

https://github.com/apache/lucene-solr/pull/416#discussion_r213295374
  
--- Diff: 
solr/core/src/java/org/apache/solr/response/transform/ChildDocTransformer.java 
---
@@ -99,6 +96,9 @@ public void transform(SolrDocument rootDoc, int 
rootDocId) {
 
   // we'll need this soon...
+  final SortedDocValues segPathDocValues = DocValues.getSorted(leafReaderContext.reader(), NEST_PATH_FIELD_NAME);
+  // passing a different SortedDocValues obj since the child documents which come after are of smaller docIDs,
+  // and the iterator can not be reversed.
+  final String transformedDocPath = getPathByDocId(segRootId, DocValues.getSorted(leafReaderContext.reader(), NEST_PATH_FIELD_NAME));
--- End diff --

Can you call this rootDocPath since we refer to this doc as the "root doc" 
elsewhere here?  I can see why you chose this name.  You could add a comment 
that the "root doc" is the input doc we are adding information to, and is 
usually but not necessarily the root of the block of documents (i.e. the root 
doc may itself be a child doc of another doc).


---

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[GitHub] lucene-solr pull request #416: WIP: SOLR-12519

2018-08-28 Thread dsmiley
Github user dsmiley commented on a diff in the pull request:

https://github.com/apache/lucene-solr/pull/416#discussion_r213291939
  
--- Diff: 
solr/core/src/java/org/apache/solr/response/transform/ChildDocTransformer.java 
---
@@ -0,0 +1,263 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.solr.response.transform;
+
+import java.io.IOException;
+import java.lang.invoke.MethodHandles;
+import java.util.ArrayList;
+import java.util.Collection;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+
+import com.google.common.collect.ArrayListMultimap;
+import com.google.common.collect.Multimap;
+import org.apache.lucene.index.DocValues;
+import org.apache.lucene.index.LeafReaderContext;
+import org.apache.lucene.index.ReaderUtil;
+import org.apache.lucene.index.SortedDocValues;
+import org.apache.lucene.search.join.BitSetProducer;
+import org.apache.lucene.util.BitSet;
+import org.apache.solr.common.SolrDocument;
+import org.apache.solr.search.DocSet;
+import org.apache.solr.search.SolrDocumentFetcher;
+import org.apache.solr.search.SolrIndexSearcher;
+import org.apache.solr.search.SolrReturnFields;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import static 
org.apache.solr.response.transform.ChildDocTransformerFactory.NUM_SEP_CHAR;
+import static 
org.apache.solr.response.transform.ChildDocTransformerFactory.PATH_SEP_CHAR;
+import static org.apache.solr.schema.IndexSchema.NEST_PATH_FIELD_NAME;
+
+class ChildDocTransformer extends DocTransformer {
+  private static final Logger log = 
LoggerFactory.getLogger(MethodHandles.lookup().lookupClass());
+
+  private static final String ANON_CHILD_KEY = "_childDocuments_";
+
+  private final String name;
+  private final BitSetProducer parentsFilter;
+  private final DocSet childDocSet;
+  private final int limit;
+  private final boolean isNestedSchema;
+
+  private final SolrReturnFields childReturnFields = new 
SolrReturnFields();
+
+  ChildDocTransformer(String name, BitSetProducer parentsFilter,
+  DocSet childDocSet, boolean isNestedSchema, int 
limit) {
+this.name = name;
+this.parentsFilter = parentsFilter;
+this.childDocSet = childDocSet;
+this.limit = limit;
+this.isNestedSchema = isNestedSchema;
+  }
+
+  @Override
+  public String getName()  {
+return name;
+  }
+
+  @Override
+  public void transform(SolrDocument rootDoc, int rootDocId) {
+// note: this algorithm works both if we have _nest_path_ and also if we don't!
+
+try {
+
+  // lookup what the *previous* rootDocId is, and figure which segment 
this is
+  final SolrIndexSearcher searcher = context.getSearcher();
+  final List leaves = 
searcher.getIndexReader().leaves();
+  final int seg = ReaderUtil.subIndex(rootDocId, leaves);
+  final LeafReaderContext leafReaderContext = leaves.get(seg);
+  final int segBaseId = leafReaderContext.docBase;
+  final int segRootId = rootDocId - segBaseId;
+  final BitSet segParentsBitSet = 
parentsFilter.getBitSet(leafReaderContext);
+
+  final int segPrevRootId = segRootId==0? -1: 
segParentsBitSet.prevSetBit(segRootId - 1); // can return -1 and that's okay
+
+  if(segPrevRootId == (segRootId - 1)) {
+// doc has no children, return fast
+return;
+  }
+
+  // we'll need this soon...
+  final SortedDocValues segPathDocValues = DocValues.getSorted(leafReaderContext.reader(), NEST_PATH_FIELD_NAME);
+  // passing a different SortedDocValues obj since the child documents which come after are of smaller docIDs,
+  // and the iterator can not be reversed.
+  final String transformedDocPath = getPathByDocId(segRootId, DocValues.getSorted(leafReaderContext.reader(), NEST_PATH_FIELD_NAME));

[GitHub] lucene-solr pull request #416: WIP: SOLR-12519

2018-08-28 Thread dsmiley
Github user dsmiley commented on a diff in the pull request:

https://github.com/apache/lucene-solr/pull/416#discussion_r213292329
  
--- Diff: 
solr/core/src/java/org/apache/solr/response/transform/ChildDocTransformer.java 
---
@@ -109,9 +109,14 @@ public void transform(SolrDocument rootDoc, int 
rootDocId) {
   // Loop each child ID up to the parent (exclusive).
   for (int docId = calcDocIdToIterateFrom(lastChildId, rootDocId); 
docId < rootDocId; ++docId) {
 
-// get the path.  (note will default to ANON_CHILD_KEY if not in 
schema or oddly blank)
+// get the path.  (note will default to ANON_CHILD_KEY if schema 
is not nested or empty string if blank)
 String fullDocPath = getPathByDocId(docId - segBaseId, 
segPathDocValues);
 
+if(isNestedSchema && !fullDocPath.contains(transformedDocPath)) {
+  // is not a descendant of the transformed doc, return fast.
+  return;
--- End diff --

Whoa; shouldn't this be "continue"?  We should have a test that would fail 
on this bug.  All it would take would be an additional child doc that is not 
underneath the input root/transformed doc but follows it (as provided on 
input).  Some of the first docs we iterate over here might not descend from 
rootDocId.


---

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org


