[GitHub] [incubator-hudi] eisig edited a comment on issue #789: HoodieMergeOnReadTable rollback hangs

2019-07-17 Thread GitBox
eisig edited a comment on issue #789: HoodieMergeOnReadTable rollback hangs
URL: https://github.com/apache/incubator-hudi/issues/789#issuecomment-512675011
 
 
   The timeline does not seem to work as expected.
   I reran the demo (https://hudi.apache.org/docker_demo.html).
   In Step 6(a), the ro and rt views return the same result.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [incubator-hudi] eisig commented on issue #789: HoodieMergeOnReadTable rollback hangs

2019-07-17 Thread GitBox
eisig commented on issue #789: HoodieMergeOnReadTable rollback hangs
URL: https://github.com/apache/incubator-hudi/issues/789#issuecomment-512675011
 
 
   The timeline does not seem to work as expected.
   I reran the demo (https://hudi.apache.org/docker_demo.html).
   In Step 6(a), the ro and rt views are the same.




[GitHub] [incubator-hudi] thesuperzapper commented on a change in pull request #780: Fixes HUDI-172 : Cleanup Maven POM/Classpath

2019-07-17 Thread GitBox
thesuperzapper commented on a change in pull request #780: Fixes HUDI-172 : 
Cleanup Maven POM/Classpath
URL: https://github.com/apache/incubator-hudi/pull/780#discussion_r304743409
 
 

 ##
 File path: hoodie-integ-test/pom.xml
 ##
 @@ -6,35 +6,62 @@
 0.4.8-SNAPSHOT
 ../pom.xml
   
-  hoodie-integ-test
   4.0.0
+
+  hoodie-integ-test
+
   
-
-  org.glassfish.jersey.connectors
-  jersey-apache-connector
-  2.17
-
+
+
 
   org.glassfish.jersey.core
   jersey-server
-  2.17
+
+
+  org.glassfish.jersey.connectors
+  jersey-apache-connector
 
 
   org.glassfish.jersey.containers
   jersey-container-servlet-core
-  2.17
 
+
+
+
 
 Review comment:
   There are a lot of dependencies. But in general my approach has been to 
ensure we get the versions of packages which we specify in the base pom. 
(Unless there is some issue which requires precedence, in which case I have 
usually left a comment) 
   




[GitHub] [incubator-hudi] thesuperzapper commented on a change in pull request #780: Fixes HUDI-172 : Cleanup Maven POM/Classpath

2019-07-17 Thread GitBox
thesuperzapper commented on a change in pull request #780: Fixes HUDI-172 : 
Cleanup Maven POM/Classpath
URL: https://github.com/apache/incubator-hudi/pull/780#discussion_r304743065
 
 

 ##
 File path: hoodie-hadoop-mr/pom.xml
 ##
 @@ -118,11 +127,6 @@
   
 
   
-
 
 Review comment:
   I think this was a mistake.




[GitHub] [incubator-hudi] garyli1019 opened a new pull request #795: HUDI-171 delete tmp file after split merge failure

2019-07-17 Thread GitBox
garyli1019 opened a new pull request #795: HUDI-171 delete tmp file after split 
merge failure
URL: https://github.com/apache/incubator-hudi/pull/795
 
 
   




[GitHub] [incubator-hudi] cdmikechen commented on issue #757: spark-hoodie-bundle using hive-serde to sync hive table(Hive2.3.5)

2019-07-17 Thread GitBox
cdmikechen commented on issue #757: spark-hoodie-bundle using hive-serde to 
sync hive table(Hive2.3.5)
URL: https://github.com/apache/incubator-hudi/issues/757#issuecomment-512618204
 
 
   @xuFabius Did you hit this error when running on the cluster, or when testing in your local IDE?




[GitHub] [incubator-hudi] vinothchandar commented on a change in pull request #780: Fixes HUDI-172 : Cleanup Maven POM/Classpath

2019-07-17 Thread GitBox
vinothchandar commented on a change in pull request #780: Fixes HUDI-172 : 
Cleanup Maven POM/Classpath
URL: https://github.com/apache/incubator-hudi/pull/780#discussion_r304692830
 
 

 ##
 File path: hoodie-integ-test/pom.xml
 ##
 @@ -43,69 +70,47 @@
   test-jar
   test
 
-
-  org.awaitility
-  awaitility
-  3.1.2
-  test
-
 
   com.uber.hoodie
   hoodie-spark
   ${project.version}
   tests
   test-jar
   test
-  
-
-  org.glassfish.**
-  *
-
-  
-
-
-  com.google.guava
-  guava
-  20.0
-  test
 
+
+
 
   com.fasterxml.jackson.core
   jackson-annotations
-  2.6.4
   test
 
 
   com.fasterxml.jackson.core
   jackson-databind
-  2.6.4
   test
 
 
   com.fasterxml.jackson.datatype
   jackson-datatype-guava
-  2.9.4
+  ${fasterxml.version}
 
 Review comment:
   Can't we inherit this from the parent as usual?




[GitHub] [incubator-hudi] vinothchandar commented on a change in pull request #780: Fixes HUDI-172 : Cleanup Maven POM/Classpath

2019-07-17 Thread GitBox
vinothchandar commented on a change in pull request #780: Fixes HUDI-172 : 
Cleanup Maven POM/Classpath
URL: https://github.com/apache/incubator-hudi/pull/780#discussion_r302762896
 
 

 ##
 File path: hoodie-cli/pom.xml
 ##
 @@ -159,67 +176,51 @@
   spark-sql_2.11
 
 
+
 
-  com.jakewharton.fliptables
-  fliptables
-  1.0.2
+  commons-dbcp
+  commons-dbcp
 
 
 
-  log4j
-  log4j
-  ${log4j.version}
+  org.springframework.shell
+  spring-shell
+  ${spring.shell.version}
 
 
 
-  com.uber.hoodie
-  hoodie-hive
-  ${project.version}
+  de.vandermeer
+  asciitable
+  0.2.5
 
 
 
-  com.uber.hoodie
-  hoodie-client
-  ${project.version}
+  com.jakewharton.fliptables
+  fliptables
+  1.0.2
 
 
+
+  joda-time
+  joda-time
 
 Review comment:
   removed version




[GitHub] [incubator-hudi] vinothchandar commented on a change in pull request #780: Fixes HUDI-172 : Cleanup Maven POM/Classpath

2019-07-17 Thread GitBox
vinothchandar commented on a change in pull request #780: Fixes HUDI-172 : 
Cleanup Maven POM/Classpath
URL: https://github.com/apache/incubator-hudi/pull/780#discussion_r302763536
 
 

 ##
 File path: hoodie-client/pom.xml
 ##
 @@ -78,6 +79,71 @@
   hoodie-timeline-service
   ${project.version}
 
+
+
+
+  log4j
+  log4j
+
+
+
+
+  org.apache.parquet
+  parquet-avro
+
+
+  org.apache.parquet
 
 Review comment:
   removed already




[GitHub] [incubator-hudi] vinothchandar edited a comment on issue #714: Performance Comparison of HoodieDeltaStreamer and DataSourceAPI

2019-07-17 Thread GitBox
vinothchandar edited a comment on issue #714: Performance Comparison of 
HoodieDeltaStreamer and DataSourceAPI
URL: https://github.com/apache/incubator-hudi/issues/714#issuecomment-512610891
 
 
   @NetsanetGeb What time works for you? Are you on Slack? We can coordinate 
1-1 there. Next week works for me. 
   
   In the meantime, I am trying to establish a baseline using the 
TestDataGenerator/DeltaStreamer. 




[GitHub] [incubator-hudi] vinothchandar commented on issue #714: Performance Comparison of HoodieDeltaStreamer and DataSourceAPI

2019-07-17 Thread GitBox
vinothchandar commented on issue #714: Performance Comparison of 
HoodieDeltaStreamer and DataSourceAPI
URL: https://github.com/apache/incubator-hudi/issues/714#issuecomment-512610891
 
 
   @NetsanetGeb What time works for you? Are you on Slack? We can coordinate 
1-1 there. 
   
   In the meantime, I am trying to establish a baseline using the 
TestDataGenerator/DeltaStreamer. 




[GitHub] [incubator-hudi] bhasudha commented on issue #689: [HUDI-25] Optimize HoodieInputFormat.listStatus for faster Hive Incremental queries

2019-07-17 Thread GitBox
bhasudha commented on issue #689: [HUDI-25] Optimize 
HoodieInputFormat.listStatus for faster Hive Incremental queries
URL: https://github.com/apache/incubator-hudi/pull/689#issuecomment-512528018
 
 
   I was able to successfully cross-verify the query results between the 
current HoodieInputFormat and this new HoodieInputFormat for a few Uber 
production tables using Spark. I ran different snapshot queries on MOR tables, 
including count(*), group-bys, joins, etc. The query latencies were also 
comparable.
   
   I can't test incremental queries yet without changing the jar in the 
Hive MetaStore; I will be doing that next. My plan is to have that tested in 
staging and then gradually roll it out to production. 
   
   @n3nash @vinothchandar ^^




[GitHub] [incubator-hudi] bhasudha commented on a change in pull request #689: [HUDI-25] Optimize HoodieInputFormat.listStatus for faster Hive Incremental queries

2019-07-17 Thread GitBox
bhasudha commented on a change in pull request #689: [HUDI-25] Optimize 
HoodieInputFormat.listStatus for faster Hive Incremental queries
URL: https://github.com/apache/incubator-hudi/pull/689#discussion_r304590801
 
 

 ##
 File path: 
hoodie-hadoop-mr/src/test/java/com/uber/hoodie/hadoop/HoodieInputFormatTest.java
 ##
 @@ -209,6 +231,22 @@ public void testPredicatePushDown() throws IOException {
 commit2, 2, 10);
   }
 
+  @Test
+  public void testgetIncrementalTableNames() throws IOException {
 
 Review comment:
   done!




[GitHub] [incubator-hudi] garyli1019 commented on issue #768: No Space Left On Device for upsert

2019-07-17 Thread GitBox
garyli1019 commented on issue #768: No Space Left On Device for upsert
URL: https://github.com/apache/incubator-hudi/issues/768#issuecomment-512514758
 
 
   https://issues.apache.org/jira/browse/HUDI-171
   @vinothchandar In my cluster setup, none of the Spark shuffle services use 
`/tmp`, so I think those files are left behind by Hudi. 
   Example of a file left in `/tmp`: 
   `-rw-r- 1 u_ops 168M Jul 14 18:10 d7b2a7a3-5706-4ffd-90cb-70c6650ef1e4`
   I think we can find a way to predict the file size before actually writing 
to tmp. It will be difficult to go back to the worker node to delete those 
files after the job has failed. 
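
   As a rough illustration of cleaning up on the worker itself instead of 
going back after the job has failed, the sketch below wraps the work in a 
try/finally so the temp file is deleted whether the task succeeds or throws. 
It uses only `java.nio.file`; the class `TempFileCleanupSketch`, the interface 
`TempFileTask`, and the `spill-sketch-` file prefix are hypothetical names for 
this example and are not taken from Hudi or from PR #795.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not Hudi code: runs a task against a temp file and
// always removes the file afterwards, whether the task succeeds or throws.
final class TempFileCleanupSketch {

  /** A task that writes to a temp file and may fail with an IOException. */
  interface TempFileTask {
    void run(Path tmpFile) throws IOException;
  }

  /** Creates a temp file, runs the task against it, then deletes the file. */
  static void runWithTempFile(TempFileTask task) throws IOException {
    // Created in the JVM's default temp directory (java.io.tmpdir).
    Path tmp = Files.createTempFile("spill-sketch-", ".tmp");
    try {
      task.run(tmp);
    } finally {
      // Executed on both the success path and the failure path.
      Files.deleteIfExists(tmp);
    }
  }

  public static void main(String[] args) {
    final Path[] seen = new Path[1];
    try {
      runWithTempFile(tmp -> {
        seen[0] = tmp;
        Files.write(tmp, "partial merge output".getBytes(StandardCharsets.UTF_8));
        throw new IOException("simulated merge failure");
      });
    } catch (IOException e) {
      System.out.println("task failed: " + e.getMessage());
    }
    boolean leftBehind = seen[0] != null && Files.exists(seen[0]);
    System.out.println("temp file left behind: " + leftBehind);
  }
}
```

   Note that a `finally` block only helps while the JVM is still running; if 
the executor process is killed outright, the file would still be left behind, 
so this is at best a partial answer to the problem described above.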




[GitHub] [incubator-hudi] vinothchandar merged pull request #794: Update writing_data for operations/deletes

2019-07-17 Thread GitBox
vinothchandar merged pull request #794: Update writing_data for 
operations/deletes
URL: https://github.com/apache/incubator-hudi/pull/794
 
 
   




[GitHub] [incubator-hudi] bhasudha commented on issue #764: Hoodie 0.4.7: Error upserting bucketType UPDATE for partition #, No value present

2019-07-17 Thread GitBox
bhasudha commented on issue #764: Hoodie 0.4.7:  Error upserting bucketType 
UPDATE for partition #, No value present
URL: https://github.com/apache/incubator-hudi/issues/764#issuecomment-51525
 
 
   This issue seems to have been fixed by PR 
[775](https://github.com/apache/incubator-hudi/pull/775). I was able to 
reproduce the error before the fix, but after applying PR 775 I could no longer 
reproduce it. @amaranathv can you test this PR for the empty path exception?




[GitHub] [incubator-hudi] vinothchandar opened a new pull request #794: Update writing_data for operations/deletes

2019-07-17 Thread GitBox
vinothchandar opened a new pull request #794: Update writing_data for 
operations/deletes
URL: https://github.com/apache/incubator-hudi/pull/794
 
 
- provided guidance for upsert vs insert vs bulk_insert
- provided guidance for soft deletes vs hard deletes




[GitHub] [incubator-hudi] bhasudha commented on issue #793: Allow HoodieWrapperFileSystem to wrap other proxy file-system implementations with no getScheme implementation

2019-07-17 Thread GitBox
bhasudha commented on issue #793: Allow HoodieWrapperFileSystem to wrap other 
proxy file-system implementations with no getScheme implementation
URL: https://github.com/apache/incubator-hudi/pull/793#issuecomment-512213882
 
 
   Looks good to me. @bvaradar It looks like, with PR 
[700](https://github.com/apache/incubator-hudi/pull/700), 
HoodieWrapperFileSystem is now added to HoodieTableMetaClient. Since this 
touches the query side, do we need to test other query engines for the same 
issue as well?




[GitHub] [incubator-hudi] vinothchandar commented on a change in pull request #793: Allow HoodieWrapperFileSystem to wrap other proxy file-system implementations with no getScheme implementation

2019-07-17 Thread GitBox
vinothchandar commented on a change in pull request #793: Allow 
HoodieWrapperFileSystem to wrap other proxy file-system implementations with no 
getScheme implementation
URL: https://github.com/apache/incubator-hudi/pull/793#discussion_r304344989
 
 

 ##
 File path: 
hoodie-common/src/main/java/com/uber/hoodie/common/io/storage/HoodieWrapperFileSystem.java
 ##
 @@ -121,13 +121,15 @@ public void initialize(URI uri, Configuration conf) 
throws IOException {
 // Remove 'hoodie-' prefix from path
 if (path.toString().startsWith(HOODIE_SCHEME_PREFIX)) {
   path = new Path(path.toString().replace(HOODIE_SCHEME_PREFIX, ""));
+  this.uri = path.toUri();
+} else {
+  this.uri = uri;
 }
 this.fileSystem = FSUtils.getFs(path.toString(), conf);
 // Do not need to explicitly initialize the default filesystem, its done 
already in the above
 
 Review comment:
   do we clean up lines 127-129?
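
   For readers following the hunk above, here is a minimal, self-contained 
sketch of the prefix handling it performs: if the URI starts with the 
`hoodie-` prefix, strip it and use the unwrapped URI; otherwise keep the URI 
unchanged. It uses `java.net.URI` instead of Hadoop's `Path`, and the class 
`SchemePrefixSketch` and method `unwrap` are hypothetical names for this 
example. The prefix value `hoodie-` is inferred from the code comment in the 
diff.

```java
import java.net.URI;

// Hypothetical stand-in for the prefix handling shown in the diff; it does
// not use Hadoop classes and is not the HoodieWrapperFileSystem code.
final class SchemePrefixSketch {

  // Assumed value, based on the "Remove 'hoodie-' prefix" comment in the diff.
  static final String HOODIE_SCHEME_PREFIX = "hoodie-";

  /** Returns the URI without a leading "hoodie-" prefix, or the URI itself. */
  static URI unwrap(URI uri) {
    String raw = uri.toString();
    if (raw.startsWith(HOODIE_SCHEME_PREFIX)) {
      // substring removes only the leading prefix and leaves any later
      // "hoodie-" in the path untouched.
      return URI.create(raw.substring(HOODIE_SCHEME_PREFIX.length()));
    }
    return uri;
  }

  public static void main(String[] args) {
    URI wrapped = URI.create("hoodie-hdfs://namenode:8020/tmp/table");
    System.out.println(unwrap(wrapped));
    URI plain = URI.create("file:///tmp/hoodie-table");
    System.out.println(unwrap(plain));
  }
}
```

   One deliberate difference from the diff: the sketch uses `substring` where 
the diff uses `String.replace`, which replaces every occurrence of the prefix 
in the string and not just the leading one.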




[GitHub] [incubator-hudi] vinothchandar commented on issue #793: Allow HoodieWrapperFileSystem to wrap other proxy file-system implementations with no getScheme implementation

2019-07-17 Thread GitBox
vinothchandar commented on issue #793: Allow HoodieWrapperFileSystem to wrap 
other proxy file-system implementations with no getScheme implementation
URL: https://github.com/apache/incubator-hudi/pull/793#issuecomment-512203745
 
 
   @bhasudha can you also please review this?




[GitHub] [incubator-hudi] vinothchandar edited a comment on issue #789: HoodieMergeOnReadTable rollback hangs

2019-07-17 Thread GitBox
vinothchandar edited a comment on issue #789: HoodieMergeOnReadTable rollback 
hangs
URL: https://github.com/apache/incubator-hudi/issues/789#issuecomment-512201473
 
 
   In (2), it seems like you are not seeing the delta commit data reflected? 
Again, how do we reconcile this with the duplicates you were reporting on 
#779? Do you think they are related? Which query engine are you using in (2) 
to query the rt table? (I noticed some issues in Spark SQL when running the 
demo; I am trying to see if that's related.) 
   
   




[GitHub] [incubator-hudi] vinothchandar commented on issue #779: HoodieDeltaStreamer may insert duplicate record?

2019-07-17 Thread GitBox
vinothchandar commented on issue #779: HoodieDeltaStreamer may insert duplicate 
record?
URL: https://github.com/apache/incubator-hudi/issues/779#issuecomment-512202751
 
 
   >here are multiple log files with the file id = 
c87d3580-86fe-40f9-8f6c-7c95cc91caa6 but I don't see a corresponding parquet 
file 
   
   We have to drill into what's causing this. At the moment, we don't index the 
log files, so we expect only updates to go there. If inserts end up there as 
well, they will result in duplicates. 




[GitHub] [incubator-hudi] vinothchandar commented on issue #789: HoodieMergeOnReadTable rollback hangs

2019-07-17 Thread GitBox
vinothchandar commented on issue #789: HoodieMergeOnReadTable rollback hangs
URL: https://github.com/apache/incubator-hudi/issues/789#issuecomment-512201473
 
 
   In (2), it seems like you are not seeing the delta commit data reflected? 
Again, how do we reconcile this with the duplicates you were reporting on 
#779? Do you think they are related?




[incubator-hudi] branch master updated: Fixing default value for avro 1.7 which assumes NULL value instead of a jsonnode that is null (#792)

2019-07-17 Thread vinoth
This is an automated email from the ASF dual-hosted git repository.

vinoth pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-hudi.git


The following commit(s) were added to refs/heads/master by this push:
 new 6efa163  Fixing default value for avro 1.7 which assumes NULL value 
instead of a jsonnode that is null (#792)
6efa163 is described below

commit 6efa16317c0f0f13798d739d9615dda24bf91bcf
Author: n3nash 
AuthorDate: Wed Jul 17 03:25:54 2019 -0700

Fixing default value for avro 1.7 which assumes NULL value instead of a 
jsonnode that is null (#792)
---
 .../java/com/uber/hoodie/common/util/HoodieAvroUtils.java  | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git 
a/hoodie-common/src/main/java/com/uber/hoodie/common/util/HoodieAvroUtils.java 
b/hoodie-common/src/main/java/com/uber/hoodie/common/util/HoodieAvroUtils.java
index 9b34fab..3a9443e 100644
--- 
a/hoodie-common/src/main/java/com/uber/hoodie/common/util/HoodieAvroUtils.java
+++ 
b/hoodie-common/src/main/java/com/uber/hoodie/common/util/HoodieAvroUtils.java
@@ -102,15 +102,15 @@ public class HoodieAvroUtils {
 List<Schema.Field> parentFields = new ArrayList<>();
 
 Schema.Field commitTimeField = new 
Schema.Field(HoodieRecord.COMMIT_TIME_METADATA_FIELD,
-METADATA_FIELD_SCHEMA, "", null);
+METADATA_FIELD_SCHEMA, "", NullNode.getInstance());
 Schema.Field commitSeqnoField = new 
Schema.Field(HoodieRecord.COMMIT_SEQNO_METADATA_FIELD,
-METADATA_FIELD_SCHEMA, "", null);
+METADATA_FIELD_SCHEMA, "", NullNode.getInstance());
 Schema.Field recordKeyField = new 
Schema.Field(HoodieRecord.RECORD_KEY_METADATA_FIELD,
-METADATA_FIELD_SCHEMA, "", null);
+METADATA_FIELD_SCHEMA, "", NullNode.getInstance());
 Schema.Field partitionPathField = new 
Schema.Field(HoodieRecord.PARTITION_PATH_METADATA_FIELD,
-METADATA_FIELD_SCHEMA, "", null);
+METADATA_FIELD_SCHEMA, "", NullNode.getInstance());
 Schema.Field fileNameField = new 
Schema.Field(HoodieRecord.FILENAME_METADATA_FIELD,
-METADATA_FIELD_SCHEMA, "", null);
+METADATA_FIELD_SCHEMA, "", NullNode.getInstance());
 
 parentFields.add(commitTimeField);
 parentFields.add(commitSeqnoField);
@@ -119,7 +119,7 @@ public class HoodieAvroUtils {
 parentFields.add(fileNameField);
 for (Schema.Field field : schema.getFields()) {
   if (!isMetadataField(field.name())) {
-Schema.Field newField = new Schema.Field(field.name(), field.schema(), 
field.doc(), null);
+Schema.Field newField = new Schema.Field(field.name(), field.schema(), 
field.doc(), field.defaultValue());
 for (Map.Entry<String, JsonNode> prop : 
field.getJsonProps().entrySet()) {
   newField.addProp(prop.getKey(), prop.getValue());
 }
@@ -135,7 +135,7 @@ public class HoodieAvroUtils {
 
   private static Schema initRecordKeySchema() {
 Schema.Field recordKeyField = new 
Schema.Field(HoodieRecord.RECORD_KEY_METADATA_FIELD,
-METADATA_FIELD_SCHEMA, "", null);
+METADATA_FIELD_SCHEMA, "", NullNode.getInstance());
 Schema recordKeySchema = Schema.createRecord("HoodieRecordKey", "", "", 
false);
 recordKeySchema.setFields(Arrays.asList(recordKeyField));
 return recordKeySchema;



[GitHub] [incubator-hudi] vinothchandar merged pull request #792: Fixing default value for avro 1.7 which assumes NULL value instead of a jsonnode that is null

2019-07-17 Thread GitBox
vinothchandar merged pull request #792: Fixing default value for avro 1.7 which 
assumes NULL value instead of a jsonnode that is null
URL: https://github.com/apache/incubator-hudi/pull/792
 
 
   




[GitHub] [incubator-hudi] NetsanetGeb commented on issue #714: Performance Comparison of HoodieDeltaStreamer and DataSourceAPI

2019-07-17 Thread GitBox
NetsanetGeb commented on issue #714: Performance Comparison of 
HoodieDeltaStreamer and DataSourceAPI
URL: https://github.com/apache/incubator-hudi/issues/714#issuecomment-512140516
 
 
   Yes, you can extract data from [IPUMS USA](https://usa.ipums.org/usa/) to 
run the workload locally. I am not allowed to share the files I downloaded 
from there. Instead, you can extract the dataset from their site by specifying 
the column fields that you want in CSV format, and later convert it to JSON to 
use JSON as the source class. 
   I am also glad to do a video call at a time that's convenient for both of 
us, maybe on a weekend or next week, to debug it together. Thanks,

