[jira] [Commented] (SOLR-5944) Support updates of numeric DocValues

Hoss Man (JIRA) Thu, 26 May 2016 19:37:21 -0700

    [ 
https://issues.apache.org/jira/browse/SOLR-5944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15303367#comment-15303367
 ]


Hoss Man commented on SOLR-5944:
--------------------------------


Ok ... more in depth comments reviewing the latest patch (ignoring some of the 
general higher level stuff i've previously commented on).

(So far i've still focused on reviewing the tests, because we should make sure 
they're rock solid before any discussion of refacoting/improving/changing the 
code)

----

* in general, all these tests seem to depend on autoCommit being disabled, and 
use a config that is setup that way, but don't actaully assert that it's true 
in case someone changes the configs in the future
** TestInPlaceUpdate can get direct access to the SolrCore verify that for 
certain to
** the distrib tests might be able to use one of hte new cnfig APIs to check 
this (i don't know off the top of my head)
*** at a minimum define a String constant for the config file name in 
TestInPlaceUpdate and refer to it in the other tests where the same config is 
expected with a comment explaining that we're *assuming* it has autoCommit 
disabled and that TestInPlaceUpdate will fail if it does not.

* TestInPlaceUpdate
** SuppressCodecs should be removed
** should at least have class level javadocs explaining what's being tested
** testUpdatingDocValues
*** for addAndGetVersion calls where we don't care about the returned version, 
don't bother assigning it to a variable (distracting)
*** for addAndGetVersion calls where we do care about the returned version, we 
need check it for every update to that doc...
**** currently version1 is compared to newVersion1 to assert that an update 
incrememnted the version, but in between those 2 updates are 4 other places 
where that document was updated -- we have to assert it has the expected value 
(either the same as before, or new - and if new record it) after all of those 
addAndGetVersion calls, or we can't be sure where/why/how a bug exists if that 
existing comparison fails.
**** ideally we should be asserting the version of every doc when we query it 
right along side the assertion for it's updated "ratings" value
*** most of the use of "field(ratings)" can probbaly just be replaced with 
"ratings" now that DV are returnable -- although it's nice to have both 
included in the test at least once to demo that both work, but when doing that 
there should be a comment making it clear why
** testOnlyPartialUpdatesBetweenCommits
*** ditto comment about checking return value from addAndGetVersion
*** this also seems like a good place to to test if doing a redundent atomic 
update (either via set to the same value or via inc=0) returns a new version or 
not -- should it?
** DocInfo should be a private static class and have some javadocs
** based on how testing has gone so far, and the discover of LUCENE-7301 it 
seems clear that adding even single thread, single node, randomized testing of 
lots of diff types of add/update calls would be good to add
*** we could refactor/improve the "checkReplay" function I added in the last 
patch to do more testing of a randomly generated Iterable of "commands" 
(commit, doc, doc+atomic mutation, etc...)
*** and of course: improve checkReplay to verify RTG against hte uncommited 
model as well
*** testReplayFromFile and getSDdoc should probably go away once we have more 
structured tests for doing this
** createMap can be elimianted -- callers can just use SolrTestCaseJ4.map(...)
** In general the tests in this class should include more queries / sorting 
against the updated docvalues field after commits to ensure that the updated 
value is searchable & sortable
** Likewise the test methods in this class should probably have a lot more RTG 
checks -- with filter queries that constrain against the updated docvalues 
field, and checks of the expected version field -- to ensure that is all 
working properly.

* InPlaceUpdateDistribTest
** SuppressCodecs should be removed
** should at least have class level javadocs explaining what's being tested
** Once LUCENE-7301 is fixed and we can demonstate that this passes reliably 
all of the time, we should ideally refactor this to subclass SolrCloudTestCase
** in general, the "pick a random client" logic should be refactored so that 
sometimes it randomly picks a CloudSolrClient
** there should almost certianly be some "delete all docs and optimize" cleanup 
in between all of these tests
*** easy to do in an @Before method if we refactor to subclass SolrCloudTestCase
** docValuesUpdateTest
*** should randomize numdocs
*** we need to find away to eliminate the hardcoded "Thread.sleep(500);" 
calls...
**** if initially no docs have a rating value, then make the (first) test query 
be for {{rating:\[\* TO \*\]}} and execute it in a rety loop until the numFound 
matches numDocs.
**** likewise if we ensure all ratings have a value such that abs(ratings) < X, 
then the second update can use an increment such that abs(inc) > X\*3 and we 
can use {{-ratings:\[-X TO X\]}} as the query in a retry loop
** ensureRtgWorksWithPartialUpdatesTest
*** even if we're only going to test one doc, we should ensure there are a 
random num docs in the index (some before the doc we're editing, and some after)
*** if we're testing RTG, then we should be testing the version returned from 
every /get call against the last version returned from every update
** outOfOrderUpdatesIndividualReplicaTest
*** ditto comments about only one doc
*** ditto comments about testing the expected version in RTG requests
*** if we're sending updates direct to replicas to test how they handle out of 
order updates, then something better assert exactly where the leader is hosted 
and ensure we don't send to it by mistake
*** what's the point of using a threadpool and SendUpdateToReplicaTask here?  
why not just send the updates in a (randdomly assigned) determinisitc order?
**** if we are going to use an ExecutorService, then the result of 
awaitTermination has to be checked
**** ... and shutdown & awaitTermination have to be called in that order
*** since this tests puts replicas out of sync, a "delete all docs and 
optimize" followed up "wait for recovers" should happen at the end of this test 
(or just in between every test) .. especially if we refactor it (or to protect 
someone in the future who might refactor it)
** delayedReorderingFetchesMissingUpdateFromLeaderTest
*** ditto previous comments about using a threadpool and SendUpdateToReplicaTask
**** even more so considering the "Thread.sleep(100);" ... what's the pont of 
using a threadpool if we want the requests to be sequential?
*** Is there no way we can programatically tell if LIR has kicked in? ... 
pehaps by setting a ZK watch? ... this "Thread.sleep(500);" is no garuntee and 
seens arbitrary.
**** at a minimum polling in a loop for the expected results seems better then 
just a hardcoded sleep
** SendUpdateToReplicaTask
*** based on how it's used, i'm not really sure i see the point in this class, 
but assuming it continues to exist...
*** constructor takes in a Random, but method uses that global {{random()}} 
anyway.
**** should probably take in a seed, and construct it's own Random
**** {{random()}} ensures that each Thread gets it's own consistent Random 
instance -- but in Callables like this each Thread having a consistent seed 
doesn't help the reproducibility since there's no garutee which Threed from an 
ThreadPool (Executor) will invoke call().
*** instead of returning true, this should be a Callable<UpdateResponse> and 
call() should return the results of the request so the caller can assert it was 
successful (via Future.get().getStatus())
** getReplicaValue
*** using SolrTestCaseJ4.params(...) would make this method a lot shorter
*** based on where/how this method is used, i don't understand why it returns 
String instead of just Object
** assertReplicaValue
*** should take in some sort of assertion message and/or build/append an 
assertion message using the clientId
*** if getReplicaValue returns an Object, this can take an "Object expected" 
param and eliminate abonch of toString & string concating throughout the test
** simulatedUpdateRequest
*** if this method is going to assume that the only shard you ever want to 
similulate an update to is SHARD1 then the method name should be 
"simulatedUpdateRequestToShard1Replica"
*** better still - why not ask the DocRouter which shard this doc belongs in, 
and fetch the leader URL tha way?
** most usage of "addFields" can just be replaced with a call to "sdoc(...)" to 
simplify code
** replace createMap usage with SolrTestCaseJ4.map
** why override tearDown if we're just calling super?
** in general, i think this test would be a lot easier to read if there were 
well named variables for HttpSolrClient instances pointed at specific replicas 
(ie HttpSolrClient SHARD1_LEADER = ...; HttpSolrClient SHARD1_REPLICA1 = ...; 
etc...) and passed those around to the various methods instead of magic ints 
(ie: "1", "2") to refer to which index in the static clients list should be 
used for a given update.

* TestStressInPlaceUpdates
** ditto comments from InPlaceUpdateDistribTest about regarding @SupressCodecs, 
javadocs, and extending SolrCloudTestCase once LUCENE-7301 is fixed and we're 
sure this test passes reliably.
*** also: we should really make this test use multiple shards
** stressTest
*** it would be a lot cleaner/clearer if we refactored these anonymous Thread 
classes into private static final (inner) classes and instantiated them like 
normal objects
**** makes it a lot easier to see what threads access/share what state
**** better still would be implementing these "workers" as Callable instances 
and using an ExecutorService
*** "operations" comment is bogus (it's not just for queries)
*** maxConcurrentCommits WTF?
**** has a comment that it should be less the # warming threads, but does that 
even make sense i na distrib test?
**** currently set to nWriteThreads -- so what's the point, when/how could we 
ever possibly have more concurrent commits then the number of threads? doesn't 
that just mean that at the moment every write thread is allowed to commit if it 
wants to?
**** if there is a reason for it, then replaceing "numCommitting" with a {{new 
Semaphore(maxConcurrentCommits)}} would probably make more sense
*** why is the "hardCommit start" logic calling both {{commit();}} and 
{{clients.get(chosenClientIndex).commit();}} ?
*** I'm not convinced the "{{synchronize \{...\}; commit stuff; syncrhonize \{ 
... \};}}" sequence is actually thread safe...
**** T-W1: commit sync block 1: newCommittedModel = copy(model), version = 
snapshotCount++;
**** T-W2: updates a doc and adds it to model
**** T-W1: commit
**** T-W1: commit sync block 2: committedModel = newCommittedModel
**** T-R3: read sync block: get info from committedModel
**** T-R3: query for doc
**** ...
*** ... in the above sequence, query results seen by thread T-R3 won't match 
the model because the update from T-W2 made it into the index before the 
commit, but after the model was copied
**** i guess it's not a huge problem because the query thread doesn't bother to 
assert anything unless the versions match -- but that seems kind of risky ... 
we could theoretically never assert anything
*** having at least one pass over the model checking every doc at the end of 
the test seems like a good idea no matter what
*** I'm certain the existing "synchronized (model)" block is not thread safe 
relative to the synchronized blocks that copy the model into commitedModel, 
because the "model.put(...)" calls can change the iterator and trigger a 
ConcurrentModificationException
*** there's a bunch of "TODO" blocks realted to deletes that still need 
implemented
*** the writer threads should construct the SolrInputDocument themselves, and 
log the whole document (not just the id) when they log things, so it's easier 
to tell from the logs what updates succeed and which were rejected because of 
version conflicts
*** why is {{//fail("Were were results: "+response);}} commented out?
*** there's a lot of "instanceof ArrayList" checks that make no sense to me 
since the object came from getFirstValue
** DocInfo
*** should be a private static class and have some javadocs
*** or sould be a public class in it's own file, w/javadocs, and re-used in the 
various tests that want ot reuse it
** verbose
*** why does this method exist? why aren't callers just using log.info(...) 
directly?
*** or if callers really need to pass big sequences of stuff, they can use 
{{log.info("\{\}", Arrays.asList(...))}}
*** or worst case: this method can simplified greatly to do that internally
** addDocAndGetVersion
*** using SolrTestCaseJ4.sdoc and SolrTestCaseJ4.params will make this method a 
lot sorder
*** why are we synchronizing on cloudClient but updating with leaderClient?
**** if the point is just to ensure all udpates happem synchronously regardless 
of client, then we should just define some {{public static final Object 
UPDATE_SYNC = new Object("sync lock for updates");}} and use that
** getClientForLeader
*** i know this method is currently just a workaround for SOLR-8733, noting 
that in the method javadocs seems important
*** if we refactor this test to use multiple shards before SOLR-8733 gets 
resolved, this method can take in a uniqueKey, and consult the DocRouter to 
pick the correct shard/node.
** replace createMap usage with SolrTestCaseJ4.map
** why override tearDown if we're just calling super?


* SolrDocument
** applyOlderUpdate
*** as mentioned before, i don't think this method is correct when it comes to 
multivalued fields, and should have more unit tests of various permutations to 
be sure
*** this functionality should probably be moved to a private helper method in 
UpdateLog (it doesn't do anything that requires it have access to the internals 
of the SolrDocument)
*** no matter where it lives, it should have some javadocs


> Support updates of numeric DocValues
> ------------------------------------
>
>                 Key: SOLR-5944
>                 URL: https://issues.apache.org/jira/browse/SOLR-5944
>             Project: Solr
>          Issue Type: New Feature
>            Reporter: Ishan Chattopadhyaya
>            Assignee: Shalin Shekhar Mangar
>         Attachments: DUP.patch, SOLR-5944.patch, SOLR-5944.patch, 
> SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, 
> SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, 
> SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, 
> SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, 
> SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, 
> SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, 
> SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, 
> TestStressInPlaceUpdates.eb044ac71.beast-167-failure.stdout.txt, 
> TestStressInPlaceUpdates.eb044ac71.beast-587-failure.stdout.txt, 
> TestStressInPlaceUpdates.eb044ac71.failures.tar.gz, 
> hoss.62D328FA1DEA57FD.fail.txt, hoss.62D328FA1DEA57FD.fail2.txt, 
> hoss.62D328FA1DEA57FD.fail3.txt, hoss.D768DD9443A98DC.fail.txt, 
> hoss.D768DD9443A98DC.pass.txt
>
>
> LUCENE-5189 introduced support for updates to numeric docvalues. It would be 
> really nice to have Solr support this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

[jira] [Commented] (SOLR-5944) Support updates of numeric DocValues

Reply via email to