[GitHub] [hudi] Sugamber commented on issue #2637: [SUPPORT] - Partial Update : update few columns of a table
Sugamber commented on issue #2637: URL: https://github.com/apache/hudi/issues/2637#issuecomment-812973626 @n3nash I'm waiting for the below pull request to be merged. Please let me know if we can close this once the pull request is merged. https://github.com/apache/hudi/pull/2666 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hudi] nsivabalan commented on pull request #2762: [HUDI-1716] Regenerating schema with default null values as applicable
nsivabalan commented on pull request #2762: URL: https://github.com/apache/hudi/pull/2762#issuecomment-812969426 @bvaradar : I see that SchemaConverters.toAvroType(...) is used in 3 places in the Hudi code base. 2 of those have been fixed as part of this patch. Wondering if we need to fix HoodieSparkBootstrapSchemaProvider as well.
[jira] [Updated] (HUDI-1716) rt view w/ MOR tables fails after schema evolution
[ https://issues.apache.org/jira/browse/HUDI-1716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-1716:
Labels: pull-request-available sev:critical user-support-issues (was: sev:critical user-support-issues)

rt view w/ MOR tables fails after schema evolution

Key: HUDI-1716
URL: https://issues.apache.org/jira/browse/HUDI-1716
Project: Apache Hudi
Issue Type: Bug
Components: Storage Management
Reporter: sivabalan narayanan
Assignee: Aditya Tiwari
Priority: Major
Labels: pull-request-available, sev:critical, user-support-issues
Fix For: 0.9.0

Looks like the realtime view w/ a MOR table fails if the schema present in an existing log file is evolved to add a new field. There are no issues with writing, but reading fails.
More info: https://github.com/apache/hudi/issues/2675

Gist of the stack trace:

Caused by: org.apache.avro.AvroTypeException: Found hoodie.hudi_trips_cow.hudi_trips_cow_record, expecting hoodie.hudi_trips_cow.hudi_trips_cow_record, missing required field evolvedField
  at org.apache.avro.io.ResolvingDecoder.doAction(ResolvingDecoder.java:292)
  at org.apache.avro.io.parsing.Parser.advance(Parser.java:88)
  at org.apache.avro.io.ResolvingDecoder.readFieldOrder(ResolvingDecoder.java:130)
  at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:215)
  at org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:175)
  at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:153)
  at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:145)
  at org.apache.hudi.common.table.log.block.HoodieAvroDataBlock.deserializeRecords(HoodieAvroDataBlock.java:165)
  at org.apache.hudi.common.table.log.block.HoodieDataBlock.createRecordsFromContentBytes(HoodieDataBlock.java:128)
  at org.apache.hudi.common.table.log.block.HoodieDataBlock.getRecords(HoodieDataBlock.java:106)
  at org.apache.hudi.common.table.log.AbstractHoodieLogRecordScanner.processDataBlock(AbstractHoodieLogRecordScanner.java:289)
  at org.apache.hudi.common.table.log.AbstractHoodieLogRecordScanner.processQueuedBlocksForInstant(AbstractHoodieLogRecordScanner.java:324)
  at org.apache.hudi.common.table.log.AbstractHoodieLogRecordScanner.scan(AbstractHoodieLogRecordScanner.java:252)
  ... 24 more
21/03/25 11:27:03 WARN TaskSetManager: Lost task 0.0 in stage 83.0 (TID 667, sivabala-c02xg219jgh6.attlocal.net, executor driver): org.apache.hudi.exception.HoodieException: Exception when reading log file
  at org.apache.hudi.common.table.log.AbstractHoodieLogRecordScanner.scan(AbstractHoodieLogRecordScanner.java:261)
  at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner.performScan(HoodieMergedLogRecordScanner.java:100)
  at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner.<init>(HoodieMergedLogRecordScanner.java:93)
  at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner.<init>(HoodieMergedLogRecordScanner.java:75)
  at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner$Builder.build(HoodieMergedLogRecordScanner.java:230)
  at org.apache.hudi.HoodieMergeOnReadRDD$.scanLog(HoodieMergeOnReadRDD.scala:328)
  at org.apache.hudi.HoodieMergeOnReadRDD$$anon$3.<init>(HoodieMergeOnReadRDD.scala:210)
  at org.apache.hudi.HoodieMergeOnReadRDD.payloadCombineFileIterator(HoodieMergeOnReadRDD.scala:200)
  at org.apache.hudi.HoodieMergeOnReadRDD.compute(HoodieMergeOnReadRDD.scala:77)

Logs from local run: https://gist.github.com/nsivabalan/656956ab313676617d84002ef8942198
Diff with which the above logs were generated: https://gist.github.com/nsivabalan/84dad29bc1ab567ebb6ee8c63b3969ec

Steps to reproduce in spark shell:
1. Create a MOR table w/ schema1.
2. Ingest (with schema1) until log files are created (verify via hudi-cli). It took me 2 batches of updates to see a log file.
3. Create a new schema2 with one additional field. Ingest a batch with schema2 that updates existing records.
4. Read the entire dataset.

-- This message was sent by Atlassian Jira (v8.3.4#803005)
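The failure above follows from Avro's schema-resolution rule: a field present in the reader schema but absent from the writer (log-file) schema must carry a default, otherwise decoding fails with `missing required field`. A minimal Python sketch of that rule, with illustrative field names and a hypothetical helper (not Avro or Hudi code):

```python
# Illustrative sketch of Avro's reader/writer resolution rule that produces
# the "missing required field" error above; not actual Avro or Hudi code.
def check_reader_compat(writer_fields, reader_schema):
    """Every reader field missing from the writer schema needs a default."""
    for field in reader_schema["fields"]:
        if field["name"] not in writer_fields and "default" not in field:
            raise ValueError("missing required field " + field["name"])

writer_fields = {"uuid", "fare"}  # schema1: what the log block was written with
schema2 = {"fields": [
    {"name": "uuid", "type": "string"},
    {"name": "fare", "type": "double"},
    # evolvedField added with no default -> old log blocks can no longer decode
    {"name": "evolvedField", "type": ["string", "null"]},
]}

try:
    check_reader_compat(writer_fields, schema2)
except ValueError as e:
    print(e)  # missing required field evolvedField

# With a null-first union and an explicit null default, resolution succeeds:
schema2["fields"][2] = {"name": "evolvedField",
                        "type": ["null", "string"], "default": None}
check_reader_compat(writer_fields, schema2)  # no error
```

This is why writing succeeds but reading fails only once a log block written with the old schema must be decoded against the evolved schema.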
[GitHub] [hudi] nsivabalan opened a new pull request #2762: [HUDI-1716] Regenerating schema with default null values as applicable
nsivabalan opened a new pull request #2762: URL: https://github.com/apache/hudi/pull/2762

## What is the purpose of the pull request
For UNION schema types, Avro expects "null" to be the first entry for null to be considered the default value. But since Hudi leverages SchemaConverters.toAvroType(...) from the org.apache.spark:spark-avro_* library, converting a StructType to Avro results in "null" being the 2nd entry for UNION-typed schemas. Also, no default value is set in the Avro schema thus generated. This patch fixes the issue.

## Brief change log
- Fix the StructType-to-Avro schema conversion so that "null" is the first entry in UNION schema types, and add default values for the same.

## Verify this pull request
This change added tests and can be verified as follows:
- Added tests to HoodieSparkSqlWriterSuite. This test fails without the source-code fix for MOR tables and succeeds with the fix.

## Committer checklist
- [ ] Has a corresponding JIRA in PR title & commit
- [ ] Commit message is descriptive of the change
- [ ] CI is green
- [ ] Necessary doc changes done or have another open PR
- [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.
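The shape of the fix can be sketched as a post-processing pass over the converted schema. This is an illustrative Python sketch under my own assumptions (the helper name and schema dict are hypothetical; the actual patch works on Spark's SchemaConverters output in Scala):

```python
# Hypothetical sketch of the fix described in the PR text: rewrite each
# nullable UNION so "null" comes first and a null default is present,
# as Avro requires for null to act as the default value.
import json

def fix_nullable_unions(schema: dict) -> dict:
    for field in schema.get("fields", []):
        t = field["type"]
        if isinstance(t, list) and "null" in t and t[0] != "null":
            field["type"] = ["null"] + [x for x in t if x != "null"]
            field.setdefault("default", None)  # explicit null default
    return schema

# spark-avro style output: union with "null" second and no default
converted = {"type": "record", "name": "rec", "fields": [
    {"name": "evolvedField", "type": ["string", "null"]}]}
fixed = fix_nullable_unions(converted)
print(json.dumps(fixed["fields"][0]))
# -> {"name": "evolvedField", "type": ["null", "string"], "default": null}
```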
[GitHub] [hudi] ssdong commented on issue #2707: [SUPPORT] insert_ovewrite_table failed on archiving
ssdong commented on issue #2707: URL: https://github.com/apache/hudi/issues/2707#issuecomment-812964282 btw, @jsbali, are you working on both of these tickets? https://issues.apache.org/jira/browse/HUDI-1739 https://issues.apache.org/jira/browse/HUDI-1740 Can I pick up one as I would be happy to contribute?
[GitHub] [hudi] rubenssoto commented on issue #2508: [SUPPORT] Error upserting bucketType UPDATE for partition
rubenssoto commented on issue #2508: URL: https://github.com/apache/hudi/issues/2508#issuecomment-812936268 Now I'm testing with Hudi 0.8.0-rc1.
[GitHub] [hudi] rubenssoto commented on issue #2508: [SUPPORT] Error upserting bucketType UPDATE for partition
rubenssoto commented on issue #2508: URL: https://github.com/apache/hudi/issues/2508#issuecomment-812936246 When I don't enable the row writer on bulk insert, I have no problems.
[GitHub] [hudi] rubenssoto commented on issue #2508: [SUPPORT] Error upserting bucketType UPDATE for partition
rubenssoto commented on issue #2508: URL: https://github.com/apache/hudi/issues/2508#issuecomment-812936183 Hello, sorry for my very late response, but I tried again and I have problems with only one table. Same table, same error:

21/04/03 22:09:32 WARN TaskSetManager: Lost task 0.0 in stage 31.0 (TID 1025, ip-10-0-49-182.us-west-2.compute.internal, executor 2): org.apache.hudi.exception.HoodieUpsertException: Error upserting bucketType UPDATE for partition :0
  at org.apache.hudi.table.action.commit.BaseSparkCommitActionExecutor.handleUpsertPartition(BaseSparkCommitActionExecutor.java:288)
  at org.apache.hudi.table.action.commit.BaseSparkCommitActionExecutor.lambda$execute$ecf5068c$1(BaseSparkCommitActionExecutor.java:139)
  at org.apache.spark.api.java.JavaRDDLike.$anonfun$mapPartitionsWithIndex$1(JavaRDDLike.scala:102)
  at org.apache.spark.api.java.JavaRDDLike.$anonfun$mapPartitionsWithIndex$1$adapted(JavaRDDLike.scala:102)
  at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsWithIndex$2(RDD.scala:889)
  at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsWithIndex$2$adapted(RDD.scala:889)
  at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
  at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:349)
  at org.apache.spark.rdd.RDD.iterator(RDD.scala:313)
  at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
  at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:349)
  at org.apache.spark.rdd.RDD.$anonfun$getOrCompute$1(RDD.scala:362)
  at org.apache.spark.storage.BlockManager.$anonfun$doPutIterator$1(BlockManager.scala:1388)
  at org.apache.spark.storage.BlockManager.org$apache$spark$storage$BlockManager$$doPut(BlockManager.scala:1298)
  at org.apache.spark.storage.BlockManager.doPutIterator(BlockManager.scala:1362)
  at org.apache.spark.storage.BlockManager.getOrElseUpdate(BlockManager.scala:1186)
  at org.apache.spark.rdd.RDD.getOrCompute(RDD.scala:360)
  at org.apache.spark.rdd.RDD.iterator(RDD.scala:311)
  at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
  at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:349)
  at org.apache.spark.rdd.RDD.iterator(RDD.scala:313)
  at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
  at org.apache.spark.scheduler.Task.run(Task.scala:127)
  at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:446)
  at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1377)
  at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:449)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
  at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.hudi.exception.HoodieException: org.apache.hudi.exception.HoodieException: java.util.concurrent.ExecutionException: org.apache.hudi.exception.HoodieException: operation has failed
  at org.apache.hudi.table.action.commit.SparkMergeHelper.runMerge(SparkMergeHelper.java:102)
  at org.apache.hudi.table.action.commit.BaseSparkCommitActionExecutor.handleUpdateInternal(BaseSparkCommitActionExecutor.java:317)
  at org.apache.hudi.table.action.commit.BaseSparkCommitActionExecutor.handleUpdate(BaseSparkCommitActionExecutor.java:308)
  at org.apache.hudi.table.action.commit.BaseSparkCommitActionExecutor.handleUpsertPartition(BaseSparkCommitActionExecutor.java:281)
  ... 28 more
Caused by: org.apache.hudi.exception.HoodieException: java.util.concurrent.ExecutionException: org.apache.hudi.exception.HoodieException: operation has failed
  at org.apache.hudi.common.util.queue.BoundedInMemoryExecutor.execute(BoundedInMemoryExecutor.java:143)
  at org.apache.hudi.table.action.commit.SparkMergeHelper.runMerge(SparkMergeHelper.java:100)
  ... 31 more
Caused by: java.util.concurrent.ExecutionException: org.apache.hudi.exception.HoodieException: operation has failed
  at java.util.concurrent.FutureTask.report(FutureTask.java:122)
  at java.util.concurrent.FutureTask.get(FutureTask.java:192)
  at org.apache.hudi.common.util.queue.BoundedInMemoryExecutor.execute(BoundedInMemoryExecutor.java:141)
  ... 32 more
Caused by: org.apache.hudi.exception.HoodieException: operation has failed
  at org.apache.hudi.common.util.queue.BoundedInMemoryQueue.throwExceptionIfFailed(BoundedInMemoryQueue.java:247)
  at org.apache.hudi.common.util.queue.BoundedInMemoryQueue.readNextRecord(BoundedInMemoryQueue.java:226)
  at org.apache.hudi.common.util.queue.BoundedInMemoryQueue.access$100(BoundedInMemoryQueue.java:52) at
[GitHub] [hudi] rubenssoto commented on pull request #2627: [HUDI-1653] Add support for composite keys in NonpartitionedKeyGenerator
rubenssoto commented on pull request #2627: URL: https://github.com/apache/hudi/pull/2627#issuecomment-812904015 Hello guys, will the NonPartitionedExtractor of hudi sync work too?
[GitHub] [hudi] pratyakshsharma commented on issue #2760: [SUPPORT] Possibly Incorrect Documentation
pratyakshsharma commented on issue #2760: URL: https://github.com/apache/hudi/issues/2760#issuecomment-812891514 Filed a jira here for the same - https://issues.apache.org/jira/browse/HUDI-1760.
[jira] [Created] (HUDI-1760) Incorrect Documentation for HoodieWriteConfigs
Pratyaksh Sharma created HUDI-1760:

Summary: Incorrect Documentation for HoodieWriteConfigs
Key: HUDI-1760
URL: https://issues.apache.org/jira/browse/HUDI-1760
Project: Apache Hudi
Issue Type: Bug
Reporter: Pratyaksh Sharma

GH Issue - https://github.com/apache/hudi/issues/2760
[jira] [Commented] (HUDI-1741) Row Level TTL Support for records stored in Hudi
[ https://issues.apache.org/jira/browse/HUDI-1741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17314297#comment-17314297 ] Pratyaksh Sharma commented on HUDI-1741: Guess the same can be handled with this Jira - https://issues.apache.org/jira/browse/HUDI-349? [~vbalaji] [~shivnarayan]

Row Level TTL Support for records stored in Hudi

Key: HUDI-1741
URL: https://issues.apache.org/jira/browse/HUDI-1741
Project: Apache Hudi
Issue Type: New Feature
Components: Utilities
Reporter: Balaji Varadarajan
Priority: Major

E.g.: have records only updated last month.

GH: https://github.com/apache/hudi/issues/2743
[GitHub] [hudi] pratyakshsharma removed a comment on pull request #2744: [HUDI-1742]improve table level config priority
pratyakshsharma removed a comment on pull request #2744: URL: https://github.com/apache/hudi/pull/2744#issuecomment-812887127 Please add a small test case covering this scenario. :)
[GitHub] [hudi] pratyakshsharma commented on pull request #2744: [HUDI-1742]improve table level config priority
pratyakshsharma commented on pull request #2744: URL: https://github.com/apache/hudi/pull/2744#issuecomment-812887127 Please add a small test case covering this scenario. :)
[jira] [Updated] (HUDI-1742) improve table level config priority in HoodieMultiTableDeltaStreamer
[ https://issues.apache.org/jira/browse/HUDI-1742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-1742:
Labels: pull-request-available (was: )

improve table level config priority in HoodieMultiTableDeltaStreamer

Key: HUDI-1742
URL: https://issues.apache.org/jira/browse/HUDI-1742
Project: Apache Hudi
Issue Type: Wish
Components: DeltaStreamer
Reporter: NickYoung
Priority: Major
Labels: pull-request-available

I hope that when the table-level configuration file and the common configuration file have the same configuration, the table-level configuration is used. But currently, if the table-level configuration file and the common configuration file have the same configuration, the value in the common configuration file is adopted.
https://hudi.apache.org/blog/ingest-multiple-tables-using-hudi/
[GitHub] [hudi] pratyakshsharma commented on a change in pull request #2744: [HUDI-1742]improve table level config priority
pratyakshsharma commented on a change in pull request #2744: URL: https://github.com/apache/hudi/pull/2744#discussion_r606681411

## File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/HoodieMultiTableDeltaStreamer.java

```
@@ -117,7 +117,9 @@ private void populateTableExecutionContextList(TypedProperties properties, Strin
       checkIfTableConfigFileExists(configFolder, fs, configFilePath);
       TypedProperties tableProperties = UtilHelpers.readConfig(fs, new Path(configFilePath), new ArrayList<>()).getConfig();
       properties.forEach((k, v) -> {
-        tableProperties.setProperty(k.toString(), v.toString());
+        if (tableProperties.get(k) == null) {
```

Review comment: Can you add a check for null and empty value as well?
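The intended precedence can be sketched like this. It is a hedged Python illustration of the merge logic under review, including the reviewer's suggested empty-value check; the function name and config keys are mine, not the actual Java code:

```python
# Illustrative sketch (not the Hudi implementation): table-level properties
# should win over the common properties, so a common value is copied in only
# when the table-level file does not already set the key. Empty values are
# treated as unset, per the review comment.
def merge_configs(common: dict, table: dict) -> dict:
    merged = dict(table)
    for k, v in common.items():
        if merged.get(k) in (None, ""):  # unset or empty -> take common value
            merged[k] = v
    return merged

common = {"hoodie.datasource.write.operation": "upsert",
          "hoodie.upsert.shuffle.parallelism": "20"}
table = {"hoodie.datasource.write.operation": "insert"}
merged = merge_configs(common, table)
print(merged)  # table-level "insert" wins; missing key falls back to common
```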
[GitHub] [hudi] ssdong edited a comment on issue #2707: [SUPPORT] insert_ovewrite_table failed on archiving
ssdong edited a comment on issue #2707: URL: https://github.com/apache/hudi/issues/2707#issuecomment-812856042 @satishkotha Thanks for the tips! In fact, I was also able to reproduce this issue locally on my machine: the `insert_overwrite_table` issue that @jsbali has raised tickets https://issues.apache.org/jira/browse/HUDI-1739 and https://issues.apache.org/jira/browse/HUDI-1740 against. For tracking, and maybe to help you test the fix in the future, I am pasting the script I used for reproducing the issue here.

```
import org.apache.hudi.QuickstartUtils._
import scala.collection.JavaConversions._
import org.apache.spark.sql.SaveMode._
import org.apache.hudi.DataSourceReadOptions._
import org.apache.hudi.DataSourceWriteOptions._
import org.apache.hudi.config.HoodieWriteConfig._
import java.util.UUID
import java.sql.Timestamp

val tableName = "hudi_date_mor"
val basePath = "" // <-- fill out this value to point to your local folder (absolute path)

val writeConfigs = Map(
  "hoodie.cleaner.incremental.mode" -> "true",
  "hoodie.insert.shuffle.parallelism" -> "20",
  "hoodie.upsert.shuffle.parallelism" -> "2",
  "hoodie.clean.automatic" -> "false",
  "hoodie.datasource.write.operation" -> "insert_overwrite_table",
  "hoodie.table.name" -> tableName,
  "hoodie.datasource.write.table.type" -> "MERGE_ON_READ",
  "hoodie.cleaner.policy" -> "KEEP_LATEST_FILE_VERSIONS",
  "hoodie.keep.max.commits" -> "3",
  "hoodie.cleaner.commits.retained" -> "1",
  "hoodie.keep.min.commits" -> "2",
  "hoodie.compact.inline.max.delta.commits" -> "1"
)

val dateSMap: Map[Int, String] = Map(
  0 -> "2020-07-01",
  1 -> "2020-08-01",
  2 -> "2020-09-01"
)

val dateMap: Map[Int, Timestamp] = Map(
  0 -> Timestamp.valueOf("2010-07-01 11:00:15"),
  1 -> Timestamp.valueOf("2010-08-01 11:00:15"),
  2 -> Timestamp.valueOf("2010-09-01 11:00:15")
)

var seq = Seq(
  (0, "value", dateMap(0), dateSMap(0), UUID.randomUUID.toString)
)
for (i <- 501 to 1000) {
  seq :+= (i, "value", dateMap(i % 3), dateSMap(i % 3), UUID.randomUUID.toString)
}
val df = seq.toDF("id", "string_column", "timestamp_column", "date_string", "uuid")
```

Run the spark shell (the one taken from the hudi quick start page; I am using spark version `spark-3.0.1-bin-hadoop2.7`):

```
./spark-shell --packages org.apache.hudi:hudi-spark-bundle_2.12:0.7.0,org.apache.spark:spark-avro_2.12:3.0.1 --conf 'spark.serializer=org.apache.spark.serializer.KryoSerializer'
```

Copy the above script in there and run `df.write.format("hudi").options(writeConfigs).mode(Overwrite).save(basePath)` 4 times; on the fifth time, it throws the anticipated error:

```
Caused by: java.lang.IllegalArgumentException: Positive number of partitions required
```

Now, the thing is, we _still_ have to _manually_ delete the first commit file, which contains the empty `partitionToReplaceFileIds`; otherwise, it keeps throwing the `Positive number of partitions required` error. Setting `"hoodie.embed.timeline.server" -> "false"` _does_ help, as it forces the writer to refresh its timeline so we don't see the second error again, which is

```
java.io.FileNotFoundException: /.hoodie/20210403201659.replacecommit does not exist
```

However, `"hoodie.embed.timeline.server" -> "false"` appears not to be _quite_ necessary, since on the _6th_ write the writer is automatically refreshed with the _newest_ timeline, which puts all `*replacecommit` files back into a consistent state. If we fix the empty `partitionToReplaceFileIds` issue, we might not need to dig into the `replacecommit does not exist` issue anymore, since it is caused by the workaround of _manually_ deleting the empty commit file. Fixing it would fix everything from the start. However, I would still be curious to learn _why_ we need a `reset` of the timeline server within the `close` action on the `HoodieTableFileSystemView`. It appears unnecessary to me and could be removed if there is no strong reason behind it. After a bit of digging in that code, the `reset` within `close` was originally introduced in #600. I hope that helps you narrow down the scope a little bit. Maybe @bvaradar could explain it, if the memory is still fresh, since that PR is from about 2 years ago. Thanks.
[GitHub] [hudi] ssdong edited a comment on issue #2707: [SUPPORT] insert_ovewrite_table failed on archiving
ssdong edited a comment on issue #2707: URL: https://github.com/apache/hudi/issues/2707#issuecomment-812856042 @satishkotha Thanks for the tips! In fact, I was also able to reproduce this issue locally on my machine regarding the `insert_overwrite_table` issue that @jsbali has raised tickets https://issues.apache.org/jira/browse/HUDI-1739 and https://issues.apache.org/jira/browse/HUDI-1740 against with. For tracking and maybe helping you guys test the fix(in the future), I am pasting the script I used for reproducing the issue here. ``` import org.apache.hudi.QuickstartUtils._ import scala.collection.JavaConversions._ import org.apache.spark.sql.SaveMode._ import org.apache.hudi.DataSourceReadOptions._ import org.apache.hudi.DataSourceWriteOptions._ import org.apache.hudi.config.HoodieWriteConfig._ import java.util.UUID import java.sql.Timestamp val tableName = "hudi_date_mor" val basePath = "" < fill out this value to point to your local folder(in absolute path) val writeConfigs = Map( "hoodie.cleaner.incremental.mode" -> "true", "hoodie.insert.shuffle.parallelism" -> "20", "hoodie.upsert.shuffle.parallelism" -> "2", "hoodie.clean.automatic" -> "false", "hoodie.datasource.write.operation" -> "insert_overwrite_table", "hoodie.table.name" -> tableName, "hoodie.datasource.write.table.type" -> "MERGE_ON_READ", "hoodie.cleaner.policy" -> "KEEP_LATEST_FILE_VERSIONS", "hoodie.keep.max.commits" -> "3", "hoodie.cleaner.commits.retained" -> "1", "hoodie.keep.min.commits" -> "2", "hoodie.compact.inline.max.delta.commits" -> "1" ) val dateSMap: Map[Int, String] = Map( 0-> "2020-07-01", 1-> "2020-08-01", 2-> "2020-09-01", ) val dateMap: Map[Int, Timestamp] = Map( 0-> Timestamp.valueOf("2010-07-01 11:00:15"), 1-> Timestamp.valueOf("2010-08-01 11:00:15"), 2-> Timestamp.valueOf("2010-09-01 11:00:15"), ) var seq = Seq( (0, "value", dateMap(0), dateSMap(0), UUID.randomUUID.toString) ) for(i <- 501 to 1000) { seq :+= (i, "value", dateMap(i % 3), dateSMap(i % 3), 
UUID.randomUUID.toString) } val df = seq.toDF("id", "string_column", "timestamp_column", "date_string", "uuid") ``` Run the spark shell(the one taken from hudi quick start page and I am using spark version `spark-3.0.1-bin-hadoop2.7`): ``` ./spark-shell --packages org.apache.hudi:hudi-spark-bundle_2.12:0.7.0,org.apache.spark:spark-avro_2.12:3.0.1 --conf 'spark.serializer=org.apache.spark.serializer.KryoSerializer' ``` Copy the above script in there and hit `df.write.format("hudi").options(writeConfigs).mode(Overwrite).save(basePath)` 4 times and on the fifth time, it throws the anticipated ``` Caused by: java.lang.IllegalArgumentException: Positive number of partitions required` issue. ``` Now, the thing is, we _still_ have to _manually_ delete the first commit file, which contains the empty `partitionToReplaceFileIds`; otherwise, it would still keep throwing the `Positive number of partitions required issue. error.` The `"hoodie.embed.timeline.server" -> "false"` _does_ help as it forces the write to refresh its timeline so we wouldn't see the second error again, which is ``` java.io.FileNotFoundException: /.hoodie/20210403201659.replacecommit does not exist ``` However, it appears `"hoodie.embed.timeline.server" -> "false"` to be not _quite_ necessary since the _6th_ time we write, the writer is automatically being refreshed with the _newest_ timeline and it will put all `*replacecommit` files back to a status of integrity again. If we fix the empty `partitionToReplaceFileIds` issue, we might not need to dig into the `replacecommit does not exist` issue anymore since it is caused by the workaround of _manually_ deleting the empty commit file. It would fix everything from the start. However, I would still be curious to learn about _why_ we would need a `reset` of the timeline server within the `close` action upon the `HoodieTableFileSystemView`. It appears unnecessary to me and could be removed if there is no strong reason behind it. 
The `reset` within `close` was originally introduced in #600 after a bit of digging in that code. I hope that helps you narrow down the scope a little bit. Maybe @bvaradar could explain it if the memory is still fresh to you since that PR is about 2 years ago from now. Thanks. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hudi] ssdong edited a comment on issue #2707: [SUPPORT] insert_ovewrite_table failed on archiving
ssdong edited a comment on issue #2707: URL: https://github.com/apache/hudi/issues/2707#issuecomment-812856042

@satishkotha Thanks for the tips! In fact, I was also able to reproduce this `insert_overwrite_table` issue locally on my machine — the one @jsbali has raised tickets https://issues.apache.org/jira/browse/HUDI-1739 and https://issues.apache.org/jira/browse/HUDI-1740 against. For tracking, and to help you test the fix in the future, here is the script I used to reproduce the issue:

```
import org.apache.hudi.QuickstartUtils._
import scala.collection.JavaConversions._
import org.apache.spark.sql.SaveMode._
import org.apache.hudi.DataSourceReadOptions._
import org.apache.hudi.DataSourceWriteOptions._
import org.apache.hudi.config.HoodieWriteConfig._
import java.util.UUID
import java.sql.Timestamp

val tableName = "hudi_date_mor"
val basePath = "" // fill out this value to point to your local folder (absolute path)

val writeConfigs = Map(
  "hoodie.cleaner.incremental.mode" -> "true",
  "hoodie.insert.shuffle.parallelism" -> "20",
  "hoodie.upsert.shuffle.parallelism" -> "2",
  "hoodie.clean.automatic" -> "false",
  "hoodie.datasource.write.operation" -> "insert_overwrite_table",
  "hoodie.table.name" -> tableName,
  "hoodie.datasource.write.table.type" -> "MERGE_ON_READ",
  "hoodie.cleaner.policy" -> "KEEP_LATEST_FILE_VERSIONS",
  "hoodie.keep.max.commits" -> "3",
  "hoodie.cleaner.commits.retained" -> "1",
  "hoodie.keep.min.commits" -> "2",
  "hoodie.compact.inline.max.delta.commits" -> "1"
)

val dateSMap: Map[Int, String] = Map(
  0 -> "2020-07-01",
  1 -> "2020-08-01",
  2 -> "2020-09-01"
)

val dateMap: Map[Int, Timestamp] = Map(
  0 -> Timestamp.valueOf("2010-07-01 11:00:15"),
  1 -> Timestamp.valueOf("2010-08-01 11:00:15"),
  2 -> Timestamp.valueOf("2010-09-01 11:00:15")
)

var seq = Seq(
  (0, "value", dateMap(0), dateSMap(0), UUID.randomUUID.toString)
)
for (i <- 501 to 1000) {
  seq :+= (i, "value", dateMap(i % 3), dateSMap(i % 3), UUID.randomUUID.toString)
}
val df = seq.toDF("id", "string_column", "timestamp_column", "date_string", "uuid")
```

Run the spark shell (the one taken from the Hudi quick start page; I am using `spark-3.0.1-bin-hadoop2.7`):

```
./spark-shell --packages org.apache.hudi:hudi-spark-bundle_2.12:0.7.0,org.apache.spark:spark-avro_2.12:3.0.1 --conf 'spark.serializer=org.apache.spark.serializer.KryoSerializer'
```

Paste the above script in there and run `df.write.format("hudi").options(writeConfigs).mode(Overwrite).save(basePath)` 4 times; on the fifth time, it throws the anticipated `Caused by: java.lang.IllegalArgumentException: Positive number of partitions required` error. Now, the thing is, we _still_ have to _manually_ delete the first commit file, which contains the empty `partitionToReplaceFileIds`; otherwise, it keeps throwing the `Positive number of partitions required` error.

Setting `"hoodie.embed.timeline.server" -> "false"` _does_ help, as it forces the writer to refresh its timeline, so we would not see the second error again, which is

```
java.io.FileNotFoundException: /.hoodie/20210403201659.replacecommit does not exist
```

However, `"hoodie.embed.timeline.server" -> "false"` appears to be not _quite_ necessary, since on the _6th_ write the writer is automatically refreshed with the _newest_ timeline and all `*replacecommit` files are put back into a consistent state. If we fix the empty `partitionToReplaceFileIds` issue, we might not need to dig into the `replacecommit does not exist` issue anymore, since the latter is only caused by the workaround of _manually_ deleting the empty commit file; fixing the former would fix everything from the start. However, I would still be curious to learn _why_ we need a `reset` of the timeline server within the `close` action on the `HoodieTableFileSystemView`. It appears unnecessary to me and could be removed if there is no strong reason behind it.

After a bit of digging in that code, it looks like the `reset` within `close` was originally introduced in #600. I hope that helps you narrow down the scope a little bit. Maybe @bvaradar could explain it if the memory is still fresh, since that PR is from about 2 years ago. Thanks.

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
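For anyone scripting the reproduction rather than issuing the write command by hand, the repeated-overwrite step described above can be sketched as a loop. This is a minimal sketch, not part of the original report: it assumes the `df`, `writeConfigs`, and `basePath` values from the script above are already defined in the Spark shell, and it requires a running Spark session.

```
// Sketch only: issues the same insert_overwrite_table write five times in a row.
// Per the report above, the 5th write is expected to fail with
// "IllegalArgumentException: Positive number of partitions required".
import org.apache.spark.sql.SaveMode.Overwrite

(1 to 5).foreach { attempt =>
  println(s"write attempt #$attempt")
  df.write.format("hudi").options(writeConfigs).mode(Overwrite).save(basePath)
}
```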
[GitHub] [hudi] codecov-io edited a comment on pull request #2757: [HUDI-1757] Assigns the buckets by record key for Flink writer
codecov-io edited a comment on pull request #2757: URL: https://github.com/apache/hudi/pull/2757#issuecomment-812247500

# [Codecov](https://codecov.io/gh/apache/hudi/pull/2757?src=pr=h1) Report
> Merging [#2757](https://codecov.io/gh/apache/hudi/pull/2757?src=pr=desc) (33aa3f7) into [master](https://codecov.io/gh/apache/hudi/commit/9804662bc8e17d6936c20326f17ec7c0360dcaf6?el=desc) (9804662) will **decrease** coverage by `42.74%`.
> The diff coverage is `n/a`.

[![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/2757/graphs/tree.svg?width=650=150=pr=VTTXabwbs2)](https://codecov.io/gh/apache/hudi/pull/2757?src=pr=tree)

```diff
@@             Coverage Diff              @@
##             master   #2757       +/-   ##
============================================
- Coverage     52.12%   9.38%   -42.75%
+ Complexity     3646      48     -3598
============================================
  Files           480      54      -426
  Lines         22867    1993    -20874
  Branches       2417     236     -2181
============================================
- Hits          11920     187    -11733
+ Misses         9916    1793     -8123
+ Partials       1031      13     -1018
```

| Flag | Coverage Δ | Complexity Δ | |
|---|---|---|---|
| hudicli | `?` | `?` | |
| hudiclient | `?` | `?` | |
| hudicommon | `?` | `?` | |
| hudiflink | `?` | `?` | |
| hudihadoopmr | `?` | `?` | |
| hudisparkdatasource | `?` | `?` | |
| hudisync | `?` | `?` | |
| huditimelineservice | `?` | `?` | |
| hudiutilities | `9.38% <ø> (-60.36%)` | `0.00 <ø> (ø)` | |

Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags#carryforward-flags-in-the-pull-request-comment) to find out more.
| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2757?src=pr=tree) | Coverage Δ | Complexity Δ | |
|---|---|---|---|
| [...va/org/apache/hudi/utilities/IdentitySplitter.java](https://codecov.io/gh/apache/hudi/pull/2757/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0lkZW50aXR5U3BsaXR0ZXIuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-2.00%)` | |
| [...va/org/apache/hudi/utilities/schema/SchemaSet.java](https://codecov.io/gh/apache/hudi/pull/2757/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9TY2hlbWFTZXQuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-3.00%)` | |
| [...a/org/apache/hudi/utilities/sources/RowSource.java](https://codecov.io/gh/apache/hudi/pull/2757/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvUm93U291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-4.00%)` | |
| [.../org/apache/hudi/utilities/sources/AvroSource.java](https://codecov.io/gh/apache/hudi/pull/2757/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQXZyb1NvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-1.00%)` | |
| [.../org/apache/hudi/utilities/sources/JsonSource.java](https://codecov.io/gh/apache/hudi/pull/2757/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvblNvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-1.00%)` | |
| [...rg/apache/hudi/utilities/sources/CsvDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2757/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQ3N2REZTU291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-10.00%)` | |
| [...g/apache/hudi/utilities/sources/JsonDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2757/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvbkRGU1NvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-4.00%)` | |
| [...apache/hudi/utilities/sources/JsonKafkaSource.java](https://codecov.io/gh/apache/hudi/pull/2757/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvbkthZmthU291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-6.00%)` | |
| [...pache/hudi/utilities/sources/ParquetDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2757/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvUGFycXVldERGU1NvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-5.00%)` | |
| [...lities/schema/SchemaProviderWithPostProcessor.java](https://codecov.io/gh/apache/hudi/pull/2757/diff?src=pr=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9TY2hlbWFQcm92aWRlcldpdGhQb3N0UHJvY2Vzc29yLmphdmE=) | `0.00% <0.00%>