n3nash commented on pull request #2374:
URL: https://github.com/apache/hudi/pull/2374#issuecomment-797797580


   @vinothchandar Addressed your feedback comments. Some more things to note:
   
   1. Added explicit dependency on `curator` to control the versions, earlier 
brought in through transitive dependency. NOTE that the latest stable version 
for curator is 4.X but we have to use curator version 2.X since 4.X is not 
backwards compatible with Zookeeper 2.x and 3.X while curator 2.X is backwards 
and forward compatible. The latest stable version of curator brings in many new 
features which are not applicable to us, so it should be OK to use 2.X
   2. Added Spark Datasource based nodes to HoodieTestSuite and added a test 
case to run through optimistic locking enabled with Spark datasource. NOTE that 
we can only test using `ZookeeperBasedLockProvider` for tests and cannot use 
`HiveMetastoreBasedLockProvider`. This is because embedded  hive metastore only 
allows one Hive client. When we use spark, spark's HiveExternalCatalog already 
creates an internal hive client, when hudi tries to create a new hive client 
through the `HiveMetastoreBasedLockProvider`, it throws underlying derby 
related exceptions. Some workarounds are to not use derby and are mentioned 
here -> https://issues.apache.org/jira/browse/HIVE-21302 but this requires a 
change to the underlying choice of DB for hive metastore itself. Hence, for 
now, we can only use ZK locks for in-memory unit tests. Ran it against 
production setup of HiveMetastore to validate this is only a unit test problem. 
   3. Added TestSimpleConcurrentFileWritesConflictResolutionStrategy to ensure 
the right candidate streams are always used. 
   4. Added another section to quickstart on how to enable optimistic locks. 
This PR -> https://github.com/apache/hudi/pull/2660 has the details. Validated 
this by turning on INFO logs on docker to confirm that locking works.
   
   Some warnings ( I will add these to the docs when I send the PR for 
documentation)
   1. `DeltaStreamer` cannot work with multi-writer. Currently we copy 
checkpoints from the previous commit to the next - this breaks when we perform 
multi-writing. 
   2. Configs for retries and timeouts for locking can vary because of many 
factors : response times of file system APIs, number of commits to check 
conflict resolution against etc. We need to document that folks have to choose 
sane timeouts and retries for their use-case if defaults don't work. 
   3. Incremental pull using TimelineAPI's will not work for clients. There 
will be a follow up PR for this in progress.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to