[GitHub] [incubator-hudi] leesf opened a new pull request #856: [HUDI-222] Rename main class path to org.apache.hudi.timeline.service.TimelineService in run_server.sh

2019-08-28 Thread GitBox
leesf opened a new pull request #856: [HUDI-222] Rename main class path to 
org.apache.hudi.timeline.service.TimelineService in run_server.sh
URL: https://github.com/apache/incubator-hudi/pull/856
 
 
   see 
jira:[HUDI-222](https://jira.apache.org/jira/projects/HUDI/issues/HUDI-222)


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Updated] (HUDI-222) Rename main class path to org.apache.hudi.timeline.service.TimelineService in run_server.sh

2019-08-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HUDI-222:

Labels: pull-request-available  (was: )

> Rename main class path to org.apache.hudi.timeline.service.TimelineService in 
> run_server.sh
> ---
>
> Key: HUDI-222
> URL: https://issues.apache.org/jira/browse/HUDI-222
> Project: Apache Hudi (incubating)
> Issue Type: Bug
> Reporter: leesf
> Assignee: leesf
> Priority: Major
> Labels: pull-request-available
>
> The current main class path in run_server.sh is
> {code:java}
> com.uber.hoodie.timeline.service.TimelineService
> {code}
> , however, it should be changed to
> {code:java}
>  org.apache.hudi.timeline.service.TimelineService{code}



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[GitHub] [incubator-hudi] leesf opened a new pull request #855: [HUDI-222] Rename main class path to org.apache.hudi.timeline.serviceTimelineService in run_server.sh

2019-08-28 Thread GitBox
leesf opened a new pull request #855: [HUDI-222] Rename main class path to 
org.apache.hudi.timeline.serviceTimelineService in run_server.sh
URL: https://github.com/apache/incubator-hudi/pull/855
 
 
   see jira: 
[HUDI-222](https://jira.apache.org/jira/projects/HUDI/issues/HUDI-222)
   
   Rename main class path to org.apache.hudi.timeline.serviceTimelineService in 
run_server.sh




[GitHub] [incubator-hudi] leesf closed pull request #855: [HUDI-222] Rename main class path to org.apache.hudi.timeline.serviceTimelineService in run_server.sh

2019-08-28 Thread GitBox
leesf closed pull request #855: [HUDI-222] Rename main class path to 
org.apache.hudi.timeline.serviceTimelineService in run_server.sh
URL: https://github.com/apache/incubator-hudi/pull/855
 
 
   




[jira] [Created] (HUDI-222) rename main class path to org.apache.hudi.timeline.service.TimelineService in run_server.sh

2019-08-28 Thread leesf (Jira)
leesf created HUDI-222:
--

 Summary: rename main class path to org.apache.hudi.timeline.service.TimelineService in run_server.sh
 Key: HUDI-222
 URL: https://issues.apache.org/jira/browse/HUDI-222
 Project: Apache Hudi (incubating)
 Issue Type: Bug
 Reporter: leesf
 Assignee: leesf


The current main class path in run_server.sh is
{code:java}
com.uber.hoodie.timeline.service.TimelineService
{code}
, however, it should be changed to
{code:java}
 org.apache.hudi.timeline.service.TimelineService{code}





[jira] [Updated] (HUDI-222) Rename main class path to org.apache.hudi.timeline.service.TimelineService in run_server.sh

2019-08-28 Thread leesf (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

leesf updated HUDI-222:
---
Summary: Rename main class path to 
org.apache.hudi.timeline.service.TimelineService in run_server.sh  (was: rename 
main class path to org.apache.hudi.timeline.service.TimelineService in 
run_server.sh)

> Rename main class path to org.apache.hudi.timeline.service.TimelineService in 
> run_server.sh
> ---
>
> Key: HUDI-222
> URL: https://issues.apache.org/jira/browse/HUDI-222
> Project: Apache Hudi (incubating)
> Issue Type: Bug
> Reporter: leesf
> Assignee: leesf
> Priority: Major
>
> The current main class path in run_server.sh is
> {code:java}
> com.uber.hoodie.timeline.service.TimelineService
> {code}
> , however, it should be changed to
> {code:java}
>  org.apache.hudi.timeline.service.TimelineService{code}





Build failed in Jenkins: hudi-snapshot-deployment-0.5 #18

2019-08-28 Thread Apache Jenkins Server
See 


--
Started by user vbalaji
[EnvInject] - Loading node environment variables.
Building remotely on H37 (ubuntu xenial) in workspace 

No credentials specified
Wiping out workspace first.
Cloning the remote Git repository
Using shallow clone
Cloning repository https://github.com/apache/incubator-hudi.git
 > git init  # 
 > timeout=10
Fetching upstream changes from https://github.com/apache/incubator-hudi.git
 > git --version # timeout=10
 > git fetch --tags --progress https://github.com/apache/incubator-hudi.git 
 > +refs/heads/*:refs/remotes/origin/* --depth=1
 > git config remote.origin.url https://github.com/apache/incubator-hudi.git # 
 > timeout=10
 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # 
 > timeout=10
 > git config remote.origin.url https://github.com/apache/incubator-hudi.git # 
 > timeout=10
Fetching upstream changes from https://github.com/apache/incubator-hudi.git
 > git fetch --tags --progress https://github.com/apache/incubator-hudi.git 
 > +refs/heads/*:refs/remotes/origin/* --depth=1
 > git rev-parse origin/master^{commit} # timeout=10
Checking out Revision e0ab89b3ac22207ff45cf3cae782d64b8be01bf1 (origin/master)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f e0ab89b3ac22207ff45cf3cae782d64b8be01bf1
Commit message: "[HUDI-223] Adding a way to infer target schema from the 
dataset after the transformation (#854)"
 > git rev-list --no-walk 78e07215070106176902b57108a5689dce3cf606 # timeout=10
First time build. Skipping changelog.
Setting MAVEN_3_2_5_HOME=/home/jenkins/tools/maven/apache-maven-3.2.5
[hudi-snapshot-deployment-0.5] $ /bin/bash /tmp/jenkins8886319038641870916.sh

Detected current version as: 
'HUDI_home=
[INFO] Scanning for projects...
Downloading: https://repo.maven.apache.org/maven2/com/spotify/dockerfile-maven-extension/1.4.3/dockerfile-maven-extension-1.4.3.pom
Downloaded: https://repo.maven.apache.org/maven2/com/spotify/dockerfile-maven-extension/1.4.3/dockerfile-maven-extension-1.4.3.pom (605 B at 2.1 KB/sec)
Downloading: https://repo.maven.apache.org/maven2/com/spotify/dockerfile-maven/1.4.3/dockerfile-maven-1.4.3.pom
Downloaded: https://repo.maven.apache.org/maven2/com/spotify/dockerfile-maven/1.4.3/dockerfile-maven-1.4.3.pom (3 KB at 108.2 KB/sec)
Downloading: https://repo.maven.apache.org/maven2/com/spotify/dockerfile-maven-extension/1.4.3/dockerfile-maven-extension-1.4.3.jar
Downloaded: https://repo.maven.apache.org/maven2/com/spotify/dockerfile-maven-extension/1.4.3/dockerfile-maven-extension-1.4.3.jar (3 KB at 141.3 KB/sec)
[INFO] 
[INFO] Reactor Build Order:
[INFO] 
[INFO] Hudi
[INFO] hudi-common
[INFO] hudi-timeline-service
[INFO] hudi-hadoop-mr
[INFO] hudi-client
[INFO] hudi-hive
[INFO] hudi-spark
[INFO] hudi-utilities
[INFO] hudi-cli
[INFO] hudi-hadoop-mr-bundle
[INFO] hudi-hive-bundle
[INFO] hudi-spark-bundle
[INFO] hudi-presto-bundle
[INFO] hudi-utilities-bundle
[INFO] hudi-hadoop-docker
[INFO] hudi-hadoop-base-docker
[INFO] hudi-hadoop-namenode-docker
[INFO] hudi-hadoop-datanode-docker
[INFO] hudi-hadoop-history-docker
[INFO] hudi-hadoop-hive-docker
[INFO] hudi-hadoop-sparkbase-docker
[INFO] hudi-hadoop-sparkmaster-docker
[INFO] hudi-hadoop-sparkworker-docker
[INFO] hudi-hadoop-sparkadhoc-docker
[INFO] hudi-hadoop-presto-docker
[INFO] hudi-integ-test
[INFO] 
[INFO] 
[INFO] Building Hudi 0.5.0-SNAPSHOT
[INFO] 
[INFO] 
[INFO] --- maven-help-plugin:2.1.1:evaluate (default-cli) @ hudi ---
[INFO] No artifact parameter specified, using 
'org.apache.hudi:hudi:pom:0.5.0-SNAPSHOT' as project.
Downloading: https://repo.maven.apache.org/maven2/org/jasig/maven/maven-notice-plugin/1.1.0/maven-notice-plugin-1.1.0.pom
Downloaded: https://repo.maven.apache.org/maven2/org/jasig/maven/maven-notice-plugin/1.1.0/maven-notice-plugin-1.1.0.pom (16 KB at 520.5 KB/sec)
Downloading: https://repo.maven.apache.org/maven2/org/jasig/parent/jasig-parent/41/jasig-parent-41.pom
Downloaded: https://repo.maven.apache.org/maven2/org/jasig/parent/jasig-parent/41/jasig-parent-41.pom (18 KB at 882.9 KB/sec)
Downloading: 

[jira] [Created] (HUDI-223) Allow for simply using the the Spark Row schema in DeltaSync

2019-08-28 Thread Vinoth Chandar (Jira)
Vinoth Chandar created HUDI-223:
---

 Summary: Allow for simply using the the Spark Row schema in DeltaSync
 Key: HUDI-223
 URL: https://issues.apache.org/jira/browse/HUDI-223
 Project: Apache Hudi (incubating)
 Issue Type: Improvement
 Components: deltastreamer
 Reporter: Vinoth Chandar








[GitHub] [incubator-hudi] taherk77 commented on issue #849: [HUDI-62] Added index handle times

2019-08-28 Thread GitBox
taherk77 commented on issue #849: [HUDI-62] Added index handle times
URL: https://github.com/apache/incubator-hudi/pull/849#issuecomment-525739928
 
 
   > @taherk77 you can click on "Details" and it will take you to the job 
page.. there you can click again on the failing job and look at the logs
   > 
   > ```
   >  [WARN] 
/home/travis/build/apache/incubator-hudi/hudi-client/src/main/java/org/apache/hudi/HoodieWriteClient.java:175:25:
 WhitespaceAround: '=' is not preceded with whitespace. [WhitespaceAround]
   > [WARN] 
/home/travis/build/apache/incubator-hudi/hudi-client/src/main/java/org/apache/hudi/HoodieWriteClient.java:175:26:
 WhitespaceAround: '=' is not followed by whitespace. Empty blocks may only be 
represented as {} when not part of a multi-block statement (4.1.3) 
[WhitespaceAround]
   > [WARN] 
/home/travis/build/apache/incubator-hudi/hudi-client/src/main/java/org/apache/hudi/HoodieWriteClient.java:177:
 Line is longer than 120 characters (found 133). [LineLength]
   > ```
   > 
   > Looks like you have checkstyle failures.. Typically, people first build, 
run the tests locally , get them to pass and then push it to gh. may be we can 
try that?
   
   Hey Sorry for that @vinothchandar, I did run all the junits before pushing 
the code.

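For context on the quoted warnings, here is a minimal sketch of the kind of line Checkstyle's WhitespaceAround check reports and how it is usually fixed. The class and method names are hypothetical and are not taken from HoodieWriteClient.java.

```java
/** Hypothetical example; not code from the pull request under discussion. */
final class WhitespaceAroundSketch {

  /** Written the way WhitespaceAround flags: no whitespace on either side of '='. */
  static int flagged() {
    int total=1 + 2; // reported twice: '=' not preceded and not followed by whitespace
    return total;
  }

  /** Same logic with whitespace around '=', which satisfies the check. */
  static int clean() {
    int total = 1 + 2;
    return total;
  }
}
```

Both methods behave identically; only the formatting differs, which is why the unit tests can pass while the style check still fails the build.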



[GitHub] [incubator-hudi] vinothchandar commented on issue #856: [HUDI-222] Rename main class path to org.apache.hudi.timeline.service.TimelineService in run_server.sh

2019-08-28 Thread GitBox
vinothchandar commented on issue #856: [HUDI-222] Rename main class path to 
org.apache.hudi.timeline.service.TimelineService in run_server.sh
URL: https://github.com/apache/incubator-hudi/pull/856#issuecomment-525701454
 
 
   Re-kicked Travis. It seems the JVM just got killed; not sure why. 
   
   This looks good. Thanks for catching this! @bvaradar this script probably 
does not work at the moment, since timeline-service is no longer a fat jar? Do 
we need a JIRA to track this? 




[incubator-hudi] branch master updated: [HUDI-223] Adding a way to infer target schema from the dataset after the transformation (#854)

2019-08-28 Thread vinoth
This is an automated email from the ASF dual-hosted git repository.

vinoth pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-hudi.git


The following commit(s) were added to refs/heads/master by this push:
 new e0ab89b  [HUDI-223] Adding a way to infer target schema from the 
dataset after the transformation (#854)
e0ab89b is described below

commit e0ab89b3ac22207ff45cf3cae782d64b8be01bf1
Author: Alexander Filipchik 
AuthorDate: Wed Aug 28 04:48:38 2019 -0700

[HUDI-223] Adding a way to infer target schema from the dataset after the 
transformation (#854)

- Adding a way to decouple target and source schema providers
- Adding flattening transformer
---
 .../hudi/utilities/deltastreamer/DeltaSync.java| 20 --
 .../schema/NullTargetSchemaRegistryProvider.java   | 40 +++
 .../utilities/transform/FlatteningTransformer.java | 83 ++
 .../hudi/utilities/TestFlatteningTransformer.java  | 56 +++
 4 files changed, 194 insertions(+), 5 deletions(-)

diff --git 
a/hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/DeltaSync.java
 
b/hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/DeltaSync.java
index 075e1c9..b093010 100644
--- 
a/hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/DeltaSync.java
+++ 
b/hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/DeltaSync.java
@@ -24,7 +24,7 @@ import static 
org.apache.hudi.utilities.schema.RowBasedSchemaProvider.HOODIE_REC
 import com.codahale.metrics.Timer;
 import java.io.IOException;
 import java.io.Serializable;
-import java.util.Arrays;
+import java.util.ArrayList;
 import java.util.HashMap;
 import java.util.List;
 import java.util.function.Function;
@@ -282,9 +282,14 @@ public class DeltaSync implements Serializable {
   AvroConversionUtils.createRdd(t, HOODIE_RECORD_STRUCT_NAME, 
HOODIE_RECORD_NAMESPACE).toJavaRDD()
   );
   // Use Transformed Row's schema if not overridden
+  // Use Transformed Row's schema if not overridden. If target schema is 
not specified
+  // default to RowBasedSchemaProvider
   schemaProvider =
-  this.schemaProvider == null ? transformed.map(r -> (SchemaProvider) 
new RowBasedSchemaProvider(r.schema()))
-  .orElse(dataAndCheckpoint.getSchemaProvider()) : 
this.schemaProvider;
+  this.schemaProvider == null || this.schemaProvider.getTargetSchema() 
== null
+  ? transformed
+  .map(r -> (SchemaProvider) new 
RowBasedSchemaProvider(r.schema()))
+  .orElse(dataAndCheckpoint.getSchemaProvider())
+  : this.schemaProvider;
 } else {
   // Pull the data from the source & prepare the write
   InputBatch> dataAndCheckpoint =
@@ -472,7 +477,7 @@ public class DeltaSync implements Serializable {
 .forTable(cfg.targetTableName)
 
.withIndexConfig(HoodieIndexConfig.newBuilder().withIndexType(HoodieIndex.IndexType.BLOOM).build())
 .withAutoCommit(false);
-if (null != schemaProvider) {
+if (null != schemaProvider && null != schemaProvider.getTargetSchema()) {
   builder = 
builder.withSchema(schemaProvider.getTargetSchema().toString());
 }
 
@@ -487,7 +492,12 @@ public class DeltaSync implements Serializable {
   private void registerAvroSchemas(SchemaProvider schemaProvider) {
 // register the schemas, so that shuffle does not serialize the full 
schemas
 if (null != schemaProvider) {
-  List<Schema> schemas = Arrays.asList(schemaProvider.getSourceSchema(), 
schemaProvider.getTargetSchema());
+  List<Schema> schemas = new ArrayList<>();
+  schemas.add(schemaProvider.getSourceSchema());
+  if (schemaProvider.getTargetSchema() != null) {
+schemas.add(schemaProvider.getTargetSchema());
+  }
+
   log.info("Registering Schema :" + schemas);
   
jssc.sc().getConf().registerAvroSchemas(JavaConversions.asScalaBuffer(schemas).toList());
 }
diff --git 
a/hudi-utilities/src/main/java/org/apache/hudi/utilities/schema/NullTargetSchemaRegistryProvider.java
 
b/hudi-utilities/src/main/java/org/apache/hudi/utilities/schema/NullTargetSchemaRegistryProvider.java
new file mode 100644
index 000..109b499
--- /dev/null
+++ 
b/hudi-utilities/src/main/java/org/apache/hudi/utilities/schema/NullTargetSchemaRegistryProvider.java
@@ -0,0 +1,40 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *  http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software

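The fallback order introduced by the DeltaSync hunks in the commit above can be sketched in isolation. This is only an illustration under stated assumptions, not Hudi's code: `SchemaSource`, `pickProvider`, `nonNullSchemas`, and `of` are hypothetical stand-ins, and schemas are modelled as plain strings.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

/** Hypothetical stand-in for a schema provider; schemas are plain strings here. */
interface SchemaSource {
  String sourceSchema();

  String targetSchema(); // may be null, meaning "no target schema configured"
}

final class SchemaFallbackSketch {

  /**
   * Keeps the configured provider only when it actually supplies a target schema;
   * otherwise prefers the provider derived from the transformed rows, then the source's own.
   */
  static SchemaSource pickProvider(
      SchemaSource configured, Optional<SchemaSource> fromTransformedRows, SchemaSource fromSource) {
    boolean unusable = configured == null || configured.targetSchema() == null;
    return unusable ? fromTransformedRows.orElse(fromSource) : configured;
  }

  /** Collects the schemas to register, skipping a null target instead of adding it. */
  static List<String> nonNullSchemas(SchemaSource provider) {
    List<String> schemas = new ArrayList<>();
    schemas.add(provider.sourceSchema());
    if (provider.targetSchema() != null) {
      schemas.add(provider.targetSchema());
    }
    return schemas;
  }

  /** Small factory so callers can build a provider from two strings. */
  static SchemaSource of(String source, String target) {
    return new SchemaSource() {
      @Override public String sourceSchema() { return source; }
      @Override public String targetSchema() { return target; }
    };
  }
}
```

In this sketch, a provider built with `SchemaFallbackSketch.of("src", null)` is treated the same as no provider at all, so the row-derived provider wins when one is present.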
[jira] [Updated] (HUDI-223) Allow for simply using the the Spark Row schema in DeltaSync

2019-08-28 Thread Vinoth Chandar (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinoth Chandar updated HUDI-223:

Status: Patch Available  (was: In Progress)

> Allow for simply using the the Spark Row schema in DeltaSync 
> -
>
> Key: HUDI-223
> URL: https://issues.apache.org/jira/browse/HUDI-223
> Project: Apache Hudi (incubating)
> Issue Type: Improvement
> Components: deltastreamer
> Reporter: Vinoth Chandar
> Assignee: Vinoth Chandar
> Priority: Major
>






[jira] [Assigned] (HUDI-223) Allow for simply using the the Spark Row schema in DeltaSync

2019-08-28 Thread Vinoth Chandar (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinoth Chandar reassigned HUDI-223:
---

Assignee: Vinoth Chandar

> Allow for simply using the the Spark Row schema in DeltaSync 
> -
>
> Key: HUDI-223
> URL: https://issues.apache.org/jira/browse/HUDI-223
> Project: Apache Hudi (incubating)
> Issue Type: Improvement
> Components: deltastreamer
> Reporter: Vinoth Chandar
> Assignee: Vinoth Chandar
> Priority: Major
>






[jira] [Resolved] (HUDI-128) Setup infra for nightly/snapshot releases

2019-08-28 Thread BALAJI VARADARAJAN (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BALAJI VARADARAJAN resolved HUDI-128.
-
Resolution: Fixed

> Setup infra for nightly/snapshot releases
> -
>
> Key: HUDI-128
> URL: https://issues.apache.org/jira/browse/HUDI-128
> Project: Apache Hudi (incubating)
> Issue Type: Sub-task
> Components: asf-migration
> Reporter: Vinoth Chandar
> Assignee: BALAJI VARADARAJAN
> Priority: Major
> Labels: pull-request-available
> Time Spent: 20m
> Remaining Estimate: 0h
>






[jira] [Reopened] (HUDI-124) Ensure third party libs are compatible with ASF policy

2019-08-28 Thread BALAJI VARADARAJAN (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BALAJI VARADARAJAN reopened HUDI-124:
-

Need to update the pom file with respect to license compatibility.

> Ensure third party libs are compatible with ASF policy
> --
>
> Key: HUDI-124
> URL: https://issues.apache.org/jira/browse/HUDI-124
> Project: Apache Hudi (incubating)
> Issue Type: Sub-task
> Components: asf-migration
> Reporter: Vinoth Chandar
> Assignee: BALAJI VARADARAJAN
> Priority: Major
> Labels: pull-request-available
> Time Spent: 20m
> Remaining Estimate: 0h
>
> [http://www.apache.org/legal/resolved.html] 





[GitHub] [incubator-hudi] bvaradar opened a new pull request #858: HUDI-124 : Exclude jdk.tools from hadoop-common and update Notice files

2019-08-28 Thread GitBox
bvaradar opened a new pull request #858: HUDI-124 : Exclude jdk.tools from 
hadoop-common and update Notice files
URL: https://github.com/apache/incubator-hudi/pull/858
 
 
   Updating NOTICE.txt to be in sync with latest master.  Added exclusion in 
hadoop-common dependency to exclude jdk.tools 




[GitHub] [incubator-hudi] vinothchandar commented on a change in pull request #853: Ignore duplicate of a compaction file

2019-08-28 Thread GitBox
vinothchandar commented on a change in pull request #853: Ignore duplicate of a 
compaction file
URL: https://github.com/apache/incubator-hudi/pull/853#discussion_r318532604
 
 

 ##
 File path: 
hudi-common/src/main/java/org/apache/hudi/common/util/CompactionUtils.java
 ##
 @@ -150,11 +150,19 @@ public static HoodieCompactionPlan 
getCompactionPlan(HoodieTableMetaClient metaC
 pendingCompactionPlanWithInstants.stream().flatMap(instantPlanPair -> {
   return getPendingCompactionOperations(instantPlanPair.getKey(), 
instantPlanPair.getValue());
 }).forEach(pair -> {
-  // Defensive check to ensure a single-fileId does not have more than one 
pending compaction
+  // Defensive check to ensure a single-fileId does not have more than one 
pending compaction with different
+  // file slices. If we find a full duplicate we assume it is caused by 
eventual nature of the move operation
+  // on some DFSs.
   if (fgIdToPendingCompactionWithInstantMap.containsKey(pair.getKey())) {
-String msg = "Hoodie File Id (" + pair.getKey() + ") has more thant 1 
pending compactions. Instants: "
-+ pair.getValue() + ", " + 
fgIdToPendingCompactionWithInstantMap.get(pair.getKey());
-throw new IllegalStateException(msg);
+HoodieCompactionOperation operation = pair.getValue().getValue();
+HoodieCompactionOperation anotherOperation =
+
fgIdToPendingCompactionWithInstantMap.get(pair.getKey()).getValue();
+
+if (!operation.equals(anotherOperation)) {
 
 Review comment:
   Would the `@Deprecated public java.util.Map metrics;` field also be the 
same? @bvaradar, if you can quickly confirm that, please go ahead and merge. 




[jira] [Updated] (HUDI-223) Allow for simply using the the Spark Row schema in DeltaSync

2019-08-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HUDI-223:

Labels: pull-request-available  (was: )

> Allow for simply using the the Spark Row schema in DeltaSync 
> -
>
> Key: HUDI-223
> URL: https://issues.apache.org/jira/browse/HUDI-223
> Project: Apache Hudi (incubating)
> Issue Type: Improvement
> Components: deltastreamer
> Reporter: Vinoth Chandar
> Assignee: Vinoth Chandar
> Priority: Major
> Labels: pull-request-available
>






[GitHub] [incubator-hudi] bvaradar commented on a change in pull request #853: [HUDI-174] Ignore duplicate of a compaction file

2019-08-28 Thread GitBox
bvaradar commented on a change in pull request #853: [HUDI-174] Ignore 
duplicate of a compaction file
URL: https://github.com/apache/incubator-hudi/pull/853#discussion_r318574724
 
 

 ##
 File path: 
hudi-common/src/main/java/org/apache/hudi/common/util/CompactionUtils.java
 ##
 @@ -150,11 +150,19 @@ public static HoodieCompactionPlan 
getCompactionPlan(HoodieTableMetaClient metaC
 pendingCompactionPlanWithInstants.stream().flatMap(instantPlanPair -> {
   return getPendingCompactionOperations(instantPlanPair.getKey(), 
instantPlanPair.getValue());
 }).forEach(pair -> {
-  // Defensive check to ensure a single-fileId does not have more than one 
pending compaction
+  // Defensive check to ensure a single-fileId does not have more than one 
pending compaction with different
+  // file slices. If we find a full duplicate we assume it is caused by 
eventual nature of the move operation
+  // on some DFSs.
   if (fgIdToPendingCompactionWithInstantMap.containsKey(pair.getKey())) {
-String msg = "Hoodie File Id (" + pair.getKey() + ") has more thant 1 
pending compactions. Instants: "
-+ pair.getValue() + ", " + 
fgIdToPendingCompactionWithInstantMap.get(pair.getKey());
-throw new IllegalStateException(msg);
+HoodieCompactionOperation operation = pair.getValue().getValue();
+HoodieCompactionOperation anotherOperation =
+
fgIdToPendingCompactionWithInstantMap.get(pair.getKey()).getValue();
+
+if (!operation.equals(anotherOperation)) {
 
 Review comment:
   This should be fine in this case as we get duplicates because of reading the 
same compaction plan in 2 different files (due to non-atomic rename)

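The behaviour discussed in this review thread (tolerate an exact duplicate of a pending compaction, reject a conflicting one) can be sketched as follows. This is a simplified illustration, not the CompactionUtils implementation: `PendingCompactionSketch` and `register` are hypothetical names, file ids and plans are modelled as non-null strings, and plan equality is plain `String.equals`.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of duplicate-tolerant bookkeeping for pending compactions. */
final class PendingCompactionSketch {
  private final Map<String, String> planByFileId = new HashMap<>();

  /**
   * Registers a pending compaction plan for a file id.
   * An identical plan seen twice is ignored; a different plan for the same id is an error.
   * Returns true when the plan was newly recorded, false for an exact duplicate.
   */
  boolean register(String fileId, String plan) {
    String existing = planByFileId.get(fileId);
    if (existing == null) {
      planByFileId.put(fileId, plan);
      return true;
    }
    if (!existing.equals(plan)) {
      throw new IllegalStateException(
          "File id " + fileId + " has more than one pending compaction: " + existing + ", " + plan);
    }
    return false; // exact duplicate, e.g. the same plan read from two files
  }

  /** Number of distinct file ids with a recorded plan. */
  int size() {
    return planByFileId.size();
  }
}
```

The design choice is that only a full match counts as a harmless duplicate; any difference between the two entries still fails fast, so the defensive check keeps its value.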



[GitHub] [incubator-hudi] bvaradar merged pull request #853: [HUDI-174] Ignore duplicate of a compaction file

2019-08-28 Thread GitBox
bvaradar merged pull request #853: [HUDI-174] Ignore duplicate of a compaction 
file
URL: https://github.com/apache/incubator-hudi/pull/853
 
 
   




[jira] [Updated] (HUDI-174) Investigate atomicity guarantees out of cloud storage

2019-08-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HUDI-174:

Labels: pull-request-available  (was: )

> Investigate atomicity guarantees out of cloud storage
> -
>
> Key: HUDI-174
> URL: https://issues.apache.org/jira/browse/HUDI-174
> Project: Apache Hudi (incubating)
> Issue Type: Task
> Components: Write Client
> Reporter: Vinoth Chandar
> Assignee: BALAJI VARADARAJAN
> Priority: Major
> Labels: pull-request-available
>
> Bug report :
> we are getting a "File Id has more than 1 pending compaction" error. How 
> would I go about resolving this? (still learning hudi). Here is the stack 
> trace:
> {code}
> com.facebook.presto.spi.PrestoException: Hoodie File Id 
> (HoodieFileGroupId{partitionPath='2019/07/17', 
> fileId='ba820041-3e16-4fbd-b0a0-9e8ad22ade82-0'}) has more thant 1 pending 
> compactions. Instants: (20190718213318,{"baseInstantTime": "20190718212749", 
> "deltaFilePaths": 
> ["gs:\/\/hudi-ingest\/hudi\/data\/hudi_ingest_raw\/updates_latest\/2019\/07\/17\/.ba820041-3e16-4fbd-b0a0-9e8ad22ade82-0_20190718212749.log.1_1-21-4523"],
>  "dataFilePath": 
> "gs:\/\/hudi-ingest\/hudi\/data\/hudi_ingest_raw\/updates_latest\/2019\/07\/17\/ba820041-3e16-4fbd-b0a0-9e8ad22ade82-0_0-35-4549_20190718212749.parquet",
>  "fileId": "ba820041-3e16-4fbd-b0a0-9e8ad22ade82-0", "partitionPath": 
> "2019\/07\/17", "metrics": {"TOTAL_LOG_FILES": 1.0, "TOTAL_IO_READ_MB": 3.0, 
> "TOTAL_LOG_FILES_SIZE": 84287.0, "TOTAL_IO_WRITE_MB": 3.0, "TOTAL_IO_MB": 
> 6.0, "TOTAL_LOG_FILE_SIZE": 84287.0}}), (20190718213318,{"baseInstantTime": 
> "20190718212749", "deltaFilePaths": 
> ["gs:\/\/hudi-ingest\/hudi\/data\/hudi_ingest_raw\/updates_latest\/2019\/07\/17\/.ba820041-3e16-4fbd-b0a0-9e8ad22ade82-0_20190718212749.log.1_1-21-4523"],
>  "dataFilePath": 
> "gs:\/\/hudi-ingest\/hudi\/data\/hudi_ingest_raw\/updates_latest\/2019\/07\/17\/ba820041-3e16-4fbd-b0a0-9e8ad22ade82-0_0-35-4549_20190718212749.parquet",
>  "fileId": "ba820041-3e16-4fbd-b0a0-9e8ad22ade82-0", "partitionPath": 
> "2019\/07\/17", "metrics": {"TOTAL_LOG_FILES": 1.0, "TOTAL_IO_READ_MB": 3.0, 
> "TOTAL_LOG_FILES_SIZE": 84287.0, "TOTAL_IO_WRITE_MB": 3.0, "TOTAL_IO_MB": 
> 6.0, "TOTAL_LOG_FILE_SIZE": 84287.0}})
>   at 
> com.facebook.presto.hive.BackgroundHiveSplitLoader$HiveSplitLoaderTask.process(BackgroundHiveSplitLoader.java:192)
>   at 
> com.facebook.presto.hive.util.ResumableTasks.safeProcessTask(ResumableTasks.java:47)
>   at 
> com.facebook.presto.hive.util.ResumableTasks.access$000(ResumableTasks.java:20)
>   at 
> com.facebook.presto.hive.util.ResumableTasks$1.run(ResumableTasks.java:35)
>   at 
> io.airlift.concurrent.BoundedExecutor.drainQueue(BoundedExecutor.java:78)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: java.lang.IllegalStateException: Hoodie File Id 
> (HoodieFileGroupId{partitionPath='2019/07/17', 
> fileId='ba820041-3e16-4fbd-b0a0-9e8ad22ade82-0'}) has more thant 1 pending 
> compactions. Instants: (20190718213318,{"baseInstantTime": "20190718212749", 
> "deltaFilePaths": 
> ["gs:\/\/hudi-ingest\/hudi\/data\/hudi_ingest_raw\/updates_latest\/2019\/07\/17\/.ba820041-3e16-4fbd-b0a0-9e8ad22ade82-0_20190718212749.log.1_1-21-4523"],
>  "dataFilePath": 
> "gs:\/\/hudi-ingest\/hudi\/data\/hudi_ingest_raw\/updates_latest\/2019\/07\/17\/ba820041-3e16-4fbd-b0a0-9e8ad22ade82-0_0-35-4549_20190718212749.parquet",
>  "fileId": "ba820041-3e16-4fbd-b0a0-9e8ad22ade82-0", "partitionPath": 
> "2019\/07\/17", "metrics": {"TOTAL_LOG_FILES": 1.0, "TOTAL_IO_READ_MB": 3.0, 
> "TOTAL_LOG_FILES_SIZE": 84287.0, "TOTAL_IO_WRITE_MB": 3.0, "TOTAL_IO_MB": 
> 6.0, "TOTAL_LOG_FILE_SIZE": 84287.0}}), (20190718213318,{"baseInstantTime": 
> "20190718212749", "deltaFilePaths": 
> ["gs:\/\/hudi-ingest\/hudi\/data\/hudi_ingest_raw\/updates_latest\/2019\/07\/17\/.ba820041-3e16-4fbd-b0a0-9e8ad22ade82-0_20190718212749.log.1_1-21-4523"],
>  "dataFilePath": 
> "gs:\/\/hudi-ingest\/hudi\/data\/hudi_ingest_raw\/updates_latest\/2019\/07\/17\/ba820041-3e16-4fbd-b0a0-9e8ad22ade82-0_0-35-4549_20190718212749.parquet",
>  "fileId": "ba820041-3e16-4fbd-b0a0-9e8ad22ade82-0", "partitionPath": 
> "2019\/07\/17", "metrics": {"TOTAL_LOG_FILES": 1.0, "TOTAL_IO_READ_MB": 3.0, 
> "TOTAL_LOG_FILES_SIZE": 84287.0, "TOTAL_IO_WRITE_MB": 3.0, "TOTAL_IO_MB": 
> 6.0, "TOTAL_LOG_FILE_SIZE": 84287.0}})
>   at 
> com.uber.hoodie.common.util.CompactionUtils.lambda$getAllPendingCompactionOperations$5(CompactionUtils.java:158)
>   at 
> 

[incubator-hudi] branch master updated (e0ab89b -> 41dbac6)

2019-08-28 Thread vbalaji
This is an automated email from the ASF dual-hosted git repository.

vbalaji pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-hudi.git.


from e0ab89b  [HUDI-223] Adding a way to infer target schema from the 
dataset after the transformation (#854)
 add baea4f3  Ignore dublicate of a compaction file
 add b5d4da7  Addressing comments
 add 41dbac6  Fixed unit test

No new revisions were added by this update.

Summary of changes:
 .../org/apache/hudi/common/util/CompactionUtils.java| 16 
 .../apache/hudi/common/util/TestCompactionUtils.java| 17 -
 2 files changed, 28 insertions(+), 5 deletions(-)



Build failed in Jenkins: hudi-snapshot-deployment-0.5 #19

2019-08-28 Thread Apache Jenkins Server
See 


--
Started by user vbalaji
[EnvInject] - Loading node environment variables.
Building remotely on H24 (ubuntu xenial) in workspace 

No credentials specified
Wiping out workspace first.
Cloning the remote Git repository
Using shallow clone
Cloning repository https://github.com/apache/incubator-hudi.git
 > git init  # 
 > timeout=10
Fetching upstream changes from https://github.com/apache/incubator-hudi.git
 > git --version # timeout=10
 > git fetch --tags --progress https://github.com/apache/incubator-hudi.git 
 > +refs/heads/*:refs/remotes/origin/* --depth=1
 > git config remote.origin.url https://github.com/apache/incubator-hudi.git # 
 > timeout=10
 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # 
 > timeout=10
 > git config remote.origin.url https://github.com/apache/incubator-hudi.git # 
 > timeout=10
Fetching upstream changes from https://github.com/apache/incubator-hudi.git
 > git fetch --tags --progress https://github.com/apache/incubator-hudi.git 
 > +refs/heads/*:refs/remotes/origin/* --depth=1
 > git rev-parse origin/master^{commit} # timeout=10
Checking out Revision 41dbac69037ddd770a94cf41f39beff92aec9568 (origin/master)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 41dbac69037ddd770a94cf41f39beff92aec9568
Commit message: "Fixed unit test"
 > git rev-list --no-walk e0ab89b3ac22207ff45cf3cae782d64b8be01bf1 # timeout=10
First time build. Skipping changelog.
Setting MAVEN_3_5_4_HOME=/home/jenkins/tools/maven/apache-maven-3.5.4
[hudi-snapshot-deployment-0.5] $ /bin/bash /tmp/jenkins1303960958879600239.sh

/tmp/jenkins1303960958879600239.sh: line 15: mvn: command not found
/tmp/jenkins1303960958879600239.sh: line 16: mvn: command not found
Detected current version as: 
'HUDI_home=
Build step 'Execute shell' marked build as failure
Setting MAVEN_3_5_4_HOME=/home/jenkins/tools/maven/apache-maven-3.5.4


svn commit: r35439 - in /release/hudi: ./ KEYS

2019-08-28 Thread vbalaji
Author: vbalaji
Date: Wed Aug 28 17:00:56 2019
New Revision: 35439

Log:
Adding hudi KEYS 

Added:
release/hudi/
release/hudi/KEYS

Added: release/hudi/KEYS
==
--- release/hudi/KEYS (added)
+++ release/hudi/KEYS Wed Aug 28 17:00:56 2019
@@ -0,0 +1,275 @@
+This file contains the PGP keys of HUDI developers.
+
+Users: pgp < KEYS
+   gpg --import KEYS
+Developers:
+pgp -kxa <your name> and append it to this file.
+(pgpk -ll <your name> && pgpk -xa <your name>) >> this file.
+(gpg --list-sigs <your name>
+ && gpg --armor --export <your name>) >> this file.
+
+pub   4096R/D3541808 2014-01-09
+uid   [ultimate] Suneel Marthi (CODE SIGNING KEY) 
+sig 3D3541808 2014-01-09  Suneel Marthi (CODE SIGNING KEY) 

+sub   4096R/AF46E2DE 2014-01-09
+sig  D3541808 2014-01-09  Suneel Marthi (CODE SIGNING KEY) 

+
+-BEGIN PGP PUBLIC KEY BLOCK-
+Comment: GPGTools - https://gpgtools.org
+
+mQINBFLPJmEBEAC9d/dUZCXeyhB0fVGmJAjdjXfLebav4VqGdNZC+M1T9C3dcVsh
+X/JGme5bjJeIgVwiH5UsdNceYn1+hyxs8jXuRAWEWKP76gD+pNrp8Az0ZdBkJoAy
+zCywOPtJV2PCOz7+S5ri2nUA2+1Kgcu6IlSLMmYAGO0IAmRrjBEzxy9iGaxiNGTc
+LvQt/iVtIXWkKKI8yvpoJ8iFf3TGhpjgaC/h7cJP3zpy0SScmhJJASLXRsfocLv9
+sle6ndN9IPbDtRW8cL7Fk3VQlzp1ToVjmnQTyZZ6S1WafsjzCZ9hLN+k++o8VbvY
+v3icY6Sy0BKz0J6KwaxTkuZ6w1K7oUkVOQboKaWFIEdO+jwrEmU+Puyd8Np8jLnF
+Q0Y5GPfyMlqM3S/zaDm1t4D1eb5FLciStkxfg5wPVK6TkqB325KVD3aio5C7E7kt
+aQechHxaJXCQOtCtVY4X+L4iClnMSuk+hcSc8W8MYRTSVansItK0vI9eQZXMnpan
+w9/jk5rS4Gts1rHB7+kdjT3QRJmkyk6fEFT0fz5tfMC7N8waeEUhCaRW6lAoiqDW
+NW1h+0UGxJw+9YcGxBC0kkt3iofNOWQWmuf/BS3DHPKT7XV/YtBHe44wW0sF5L5P
+nfQUHpnA3pcZ0En6bXAvepKVZTNdOWWJqMyHV+436DA+33h45QL6lWb/GwARAQAB
+tDVTdW5lZWwgTWFydGhpIChDT0RFIFNJR05JTkcgS0VZKSA8c21hcnRoaUBhcGFj
+aGUub3JnPokCNwQTAQoAIQUCUs8mYQIbAwULCQgHAwUVCgkICwUWAgMBAAIeAQIX
+gAAKCRC08czE01QYCOKKEAChRtHBoYNTX+RZbFO0Kl1GlN+i1Ik0shEm5ZJ56XHv
+AnFx/gRK7CfZzJswWo7kf2s/dvJiFfs+rrolYVuO6E8gNhAaTEomSuvWQAMHdPcR
+9G5APRKCSkbZYugElqplEbSphk78FKoFO+sml52M7Pr9jj88ApBjoFVVY8njdnNq
+6DVlaDsg8YninCD78Z7PNFnRGwxyZ8Qd4Dh0rG+MUTfAWopZu6/MxpQxU7QpeVeX
+SIMLg7ClFrGfXnZcszYF4dnav1aa0i7W88PAdYNPko7tC5qz5yv2ep7t2gRbcYKf
+RXhYC2FHQey3wPhMKjA8V436lAqmfYnY/YdmhEy9Xq/1EdX1nHsQ7OEkfgXK14WM
+F+rnqXRAl/0cwiyb41eocdg5kpZFIKgCYT02usLWxwNnd3jOCe109Ze3y3acN/G8
++xOf9YRfNVAe6pD8H6ieRbv9gRjBmsbz9bXQCmxFnDqxNri5Me6gBAQPNmYTJD0h
+jgJTK6o0vJ0pwjBLauasJsLu+1tR3Cb0dxPE+JVaTF26FCd7pM7W6KdVfod9ZfrN
+cSyJ/cECc2KvYVGmTjQNVo1dYG0awBachlWnYNt+0Qx4opLsczZOLtPKtFY4BJA7
+aZoXT4Qf9yB8km7x2/cgNExVbFummToJ/IP3M39/EaryspsQQuM5Qu5Q5lZp8Qnn
+ybkCDQRSzyZhARAA7bAawFzbJaghYnm6mTZyGG5hQmfAynbF6cPAE+g2SnXcNQjP
+6kjYx3tSpb7rEzmjQqs46ztqdec6PIVBMhakON6z27Zz+IviAtO/TcaZHWNuCAjw
+FXVQZ+tYsSeiKInttfkrQc8jXAHWwSkSjLqNpvQpBdBEX80MYkFB6ZPOeON2+/Ta
+GC1H/HU2YngF0qQSmG33KKG6ezihBJdKxU6t2tsQfTlCmZW6R6MGpS9fVurYMKBk
+vR+7RGZ/H6dSjWPcpxhusGg92J9uz7r5SopN1wSdyPMUCMAFGeyoxcAuBDl38quU
+H/ENG3x5LDPq2aEH2AJ6yvZfIXbeJ1zmXf2cAHv+HbmvZaTSp0XIjq8Yxh8NkYEC
+ZdfRWmsGLIpU16TkBijpK3Dn9MDXjHGT3V8/qfdpURtMvIaL8WFrq9ejcy/vGRFn
+mCYqxIIPH+vLiMXKWtuMc61GN3ES21msKQH6IuQxxfQLyhK44L/pv7FpF4E+6LaE
+8uRwAex5HIDpR1v4aJq089rRtye9VXTJJLZ7lYs0HctdZ30QbBRWT4jS9d9rj3cr
+HgQ7mIGO9TAfK2kWc6AJN/EvxPWNbOwptsTUzAF/adiy9ax8C18iw7nKczC+2eN6
+UcbxXiPdytuKYK7O9A8S9e1w89GwpxYN7Xfn2o6QfpSbL9cLKiinOeV+xikAEQEA
+AYkCHwQYAQoACQUCUs8mYQIbDAAKCRC08czE01QYCG7yD/471dmyOD+go8cZkdqR
+3CHhjH03odtI0EJNVy4VGEC0r9paz3BWYTy18LqWYkw3ygphOIU1r8/7QK3H5Ke3
+c4yCSUxaMk5SlAJ+iVRek5TABkR8+zI+ZN5pQtqRH+ya5JxV4F/Sx5Q3KWMzpvgY
+n6AgSSc3hEfkgdI7SalIeyLaLDWv+RFdGZ5JU5gD28C0G8BeH8L62x6sixZcqoGT
+oy9rwkjs45/ZmmvBZhd1wLvC/au8l2Ecou6O8+8m26W8Z7vCuGKxuWn0KV3DLLWe
+66uchDVlakGoMJSPIK06JWYUlE+gL0CW+U2ekt/v2qb8hGgMVET3CBAMq+bFWuJ6
+juX7hJd7wHtCFfjnFDDAkdp2IIIZAlBW6FZGv7pJ82xsW6pSAg0A7VrV6nTtMtDv
+T8esOfo/t4t0gaL7bivy9DVVdATbUBcJJFpoVoe5MxiyjptveqPzIRwzt04n52Ph
+ordVWAnX5AokXWTg+Glem/EWEuf7jUuZArfqCSl/sZoQdXGTjR7G4iFscispji4+
+kNjVQsItqFbgDpuc6n+GcFxlKQ7YMCnu5MVtTV01U4lFs0qy0NTUqsuR35DM4z14
+DkFmj1upWAayCoXTpKzsHBvJZPC+Wqf9Pl3O47apelg7KxU3S011YfXpVPvCTKBv
+kD2o/5GKWS5QkSUEUXXY1oDiLg==
+=f8kJ
+-END PGP PUBLIC KEY BLOCK-
+
+pub   rsa4096 2019-07-29 [SC]
+  AF9BAF79D311A3D3288E583F24A499037262AAA4
+uid   [ultimate] Balaji Varadarajan 
+sig 324A499037262AAA4 2019-07-29  Balaji Varadarajan 

+sub   rsa4096 2019-07-29 [E]
+sig  24A499037262AAA4 2019-07-29  Balaji Varadarajan 

+
+-BEGIN PGP PUBLIC KEY BLOCK-
+
+mQINBF0+XtEBEADpNIZkDKZrwrHy7x8uJBSelnMGvd9z6+PYmvWYVvoGnjipjC7L
+fXzaZGofmKxDEKtQI5ip/4DlX/vRVjwNdaPfelLCPN+dZy73m2NcYH2v9OgVNf/L
+L6eqispkqIbmGRwJqq3YfsrDSqlJ5gS9B7/rSUyKx33sKzm0uHT+E/fg45q8AJBn
+ef/Y2zvSu7Stv9wYrXGBrOlwBpiRUoobcF7utAtLcr18DLgRD3K3trWpjLJqFf6O
+LDiFR25VmCQ6Lr/vPKICil75Z91CgRzkHl44drZffzqOzljz62nawSMhxzuX8ryO
+pTG8Wq3U1dS3699iCgMPYeHB4C43c0ieZf/+y7uJD7GwW7Jfnc1GuN3OwiDA16yh
+NfDQhhXlZf+iKAOBhkIGqYgy2+l587etTZqUBKWIjxwVobhX6VHKXDTC7YYxnw8n
+4emuF4nxC5ySfuJBaMFCTBvgALoBPJA4spS+uBFVygM7/ZMR2KUywhajqbpm4iEw

[jira] [Closed] (HUDI-224) Upload KEYS to dist.apache.hudi

2019-08-28 Thread BALAJI VARADARAJAN (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BALAJI VARADARAJAN closed HUDI-224.
---

> Upload KEYS to dist.apache.hudi
> ---
>
> Key: HUDI-224
> URL: https://issues.apache.org/jira/browse/HUDI-224
> Project: Apache Hudi (incubating)
>  Issue Type: Sub-task
>  Components: asf-migration
>Reporter: BALAJI VARADARAJAN
>Assignee: BALAJI VARADARAJAN
>Priority: Minor
>
> svn co https://dist.apache.org/repos/dist/dev/ --depth immediates
> cd dev
> mkdir hudi
> <
> svn add hudi
> svn commit -m 'Updating HUDI keys to include <>'  --username 
> 
>  
> svn co https://dist.apache.org/repos/dist/release/ --depth immediates
> cd release
> mkdir hudi
> <
> svn add hudi
> svn commit -m 'Updating HUDI keys to include <>'  --username 
> 





[GitHub] [incubator-hudi] smdahmed opened a new issue #859: Hudi upsert after a delete in partition will cause valid records inserted to disappear.

2019-08-28 Thread GitBox
smdahmed opened a new issue #859: Hudi upsert after a delete in partition will 
cause valid records inserted to disappear.
URL: https://github.com/apache/incubator-hudi/issues/859
 
 
   How to replicate this issue:
   
   1. Insert data into certain partition (eg: p1) -> (1, kabeer | 2, vinoth)
   2. Delete record (1, kabeer)
   3. Upsert a new record: (3, balaji)
   
   The data visible in the table is (3, balaji), whereas the expectation is 
that (2, vinoth) | (3, balaji) would be visible. 
   
   From the .hoodie files, things seem to work fine until step 2. That is, 
after the delete, the commit file clearly states that there was 1 write and 
1 delete (so the total number of records in the partition tallies up to 2). 
   
   After the upsert in step 3, the commit is wrong: it states numWrites as 1. 
   
   Can someone please advise whether further information is needed, and give 
any pointers on how this could be fixed?
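
The expectation in the report can be written down as a small, self-contained model. This is only a sketch of the semantics the reporter expects (each operation touches only its own record key); `PartitionModel` and its methods are hypothetical names for illustration and are not part of Hudi.

```java
import java.util.Map;
import java.util.TreeMap;

/** Toy in-memory model of one partition, keyed by record key (hypothetical, not Hudi code). */
public class PartitionModel {
    private final Map<Integer, String> records = new TreeMap<>();

    /** Insert or update: touches only the given key, leaves all other records alone. */
    public void upsert(int key, String value) {
        records.put(key, value);
    }

    /** Delete: removes only the given key. */
    public void delete(int key) {
        records.remove(key);
    }

    /** Returns a sorted copy of the records currently visible in the partition. */
    public Map<Integer, String> snapshot() {
        return new TreeMap<>(records);
    }

    public static void main(String[] args) {
        PartitionModel p1 = new PartitionModel();
        p1.upsert(1, "kabeer");   // step 1: insert two records
        p1.upsert(2, "vinoth");
        p1.delete(1);             // step 2: delete (1, kabeer)
        p1.upsert(3, "balaji");   // step 3: upsert a new record
        System.out.println(p1.snapshot()); // prints {2=vinoth, 3=balaji}
    }
}
```

Under this model the record (2, vinoth) survives step 3, which is the behaviour the report says is missing.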




svn commit: r35437 - /dev/hudi/

2019-08-28 Thread vbalaji
Author: vbalaji
Date: Wed Aug 28 16:54:45 2019
New Revision: 35437

Log:
Adding hudi folder

Added:
dev/hudi/



[jira] [Reopened] (HUDI-104) Replace ThreadLocal with Resource pool when caching resource type classes

2019-08-28 Thread BALAJI VARADARAJAN (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BALAJI VARADARAJAN reopened HUDI-104:
-

> Replace ThreadLocal with Resource pool when caching resource type classes
> -
>
> Key: HUDI-104
> URL: https://issues.apache.org/jira/browse/HUDI-104
> Project: Apache Hudi (incubating)
>  Issue Type: Improvement
>  Components: Common Core, Performance
>Reporter: BALAJI VARADARAJAN
>Priority: Major
>
> This is another comment that stemmed from Timeline Server code-review.
>  
> Notes from PR comments:
> '''RandomAccessFile is not thread-safe. Hence, we would have to create 
> separate RandomAccessFile objects for concurrent access. Agreed, the current 
> approach of opening a new file per get() call is slow. I have made changes to 
> use ThreadLocal objects. This should greatly reduce the open() calls, but for 
> a long-running timeline service, open handles could accumulate 
> per thread. A better approach would be to use a resource pool, but it 
> needs to be config-tuned and tested. There is another place (ThreadLocal 
> where we might require similar treatment. Will open a ticket to address it. 
> If current perf testing points in the direction of a resource pool, will 
> address it as part of this PR. Otherwise, we can take this up in a subsequent PR.'''
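
The resource-pool idea in the notes above can be sketched as follows. This is a minimal illustration under stated assumptions, not the Hudi implementation: the class name `RandomAccessFilePool`, the fixed pool size, and the read-only `read(offset, length)` method are all hypothetical. It bounds the number of open handles regardless of how many threads call it, which is the property the ThreadLocal approach lacks.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

/** Hypothetical bounded pool of read-only RandomAccessFile handles (not Hudi code). */
public class RandomAccessFilePool implements AutoCloseable {
    private final BlockingQueue<RandomAccessFile> idle;

    /** Opens exactly {@code size} handles up front; no more are ever created. */
    public RandomAccessFilePool(File file, int size) throws IOException {
        idle = new ArrayBlockingQueue<>(size);
        for (int i = 0; i < size; i++) {
            idle.add(new RandomAccessFile(file, "r"));
        }
    }

    /** Reads exactly {@code length} bytes starting at {@code offset} using a borrowed handle. */
    public byte[] read(long offset, int length) throws IOException, InterruptedException {
        RandomAccessFile handle = idle.take();   // blocks while all handles are in use
        try {
            handle.seek(offset);
            byte[] buffer = new byte[length];
            handle.readFully(buffer);            // throws EOFException if the file is too short
            return buffer;
        } finally {
            idle.add(handle);                    // always return the handle to the pool
        }
    }

    /** Closes the idle handles; call only after all readers have finished. */
    @Override
    public void close() throws IOException {
        for (RandomAccessFile handle : idle) {
            handle.close();
        }
    }
}
```

Each handle is used by one thread at a time, so the non-thread-safe RandomAccessFile is never shared concurrently. The pool size is the tuning knob the notes refer to: too small and readers queue up, too large and open handles accumulate again.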





Jenkins build is back to normal : hudi-snapshot-deployment-0.5 #20

2019-08-28 Thread Apache Jenkins Server
See 




[jira] [Resolved] (HUDI-224) Upload KEYS to dist.apache.hudi

2019-08-28 Thread BALAJI VARADARAJAN (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BALAJI VARADARAJAN resolved HUDI-224.
-
Resolution: Fixed

> Upload KEYS to dist.apache.hudi
> ---
>
> Key: HUDI-224
> URL: https://issues.apache.org/jira/browse/HUDI-224
> Project: Apache Hudi (incubating)
>  Issue Type: Sub-task
>  Components: asf-migration
>Reporter: BALAJI VARADARAJAN
>Assignee: BALAJI VARADARAJAN
>Priority: Minor
>
> svn co https://dist.apache.org/repos/dist/dev/ --depth immediates
> cd dev
> mkdir hudi
> <
> svn add hudi
> svn commit -m 'Updating HUDI keys to include <>'  --username 
> 
>  
> svn co https://dist.apache.org/repos/dist/release/ --depth immediates
> cd release
> mkdir hudi
> <
> svn add hudi
> svn commit -m 'Updating HUDI keys to include <>'  --username 
> 





[jira] [Assigned] (HUDI-224) Upload KEYS to dist.apache.hudi

2019-08-28 Thread BALAJI VARADARAJAN (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BALAJI VARADARAJAN reassigned HUDI-224:
---

Assignee: BALAJI VARADARAJAN

> Upload KEYS to dist.apache.hudi
> ---
>
> Key: HUDI-224
> URL: https://issues.apache.org/jira/browse/HUDI-224
> Project: Apache Hudi (incubating)
>  Issue Type: Sub-task
>  Components: asf-migration
>Reporter: BALAJI VARADARAJAN
>Assignee: BALAJI VARADARAJAN
>Priority: Minor
>
> svn co https://dist.apache.org/repos/dist/dev/ --depth immediates
> cd dev
> mkdir hudi
> <
> svn add hudi
> svn commit -m 'Updating HUDI keys to include <>'  --username 
> 
>  
> svn co https://dist.apache.org/repos/dist/release/ --depth immediates
> cd release
> mkdir hudi
> <
> svn add hudi
> svn commit -m 'Updating HUDI keys to include <>'  --username 
> 





[jira] [Commented] (HUDI-224) Upload KEYS to dist.apache.hudi

2019-08-28 Thread BALAJI VARADARAJAN (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16917931#comment-16917931
 ] 

BALAJI VARADARAJAN commented on HUDI-224:
-

KEYS file added to [https://dist.apache.org/repos/dist/dev/hudi/]

and 
[https://dist.apache.org/repos/dist/release/hudi/]

> Upload KEYS to dist.apache.hudi
> ---
>
> Key: HUDI-224
> URL: https://issues.apache.org/jira/browse/HUDI-224
> Project: Apache Hudi (incubating)
>  Issue Type: Sub-task
>  Components: asf-migration
>Reporter: BALAJI VARADARAJAN
>Assignee: BALAJI VARADARAJAN
>Priority: Minor
>
> svn co https://dist.apache.org/repos/dist/dev/ --depth immediates
> cd dev
> mkdir hudi
> <
> svn add hudi
> svn commit -m 'Updating HUDI keys to include <>'  --username 
> 
>  
> svn co https://dist.apache.org/repos/dist/release/ --depth immediates
> cd release
> mkdir hudi
> <
> svn add hudi
> svn commit -m 'Updating HUDI keys to include <>'  --username 
> 





svn commit: r35438 - /dev/hudi/KEYS

2019-08-28 Thread vbalaji
Author: vbalaji
Date: Wed Aug 28 16:58:40 2019
New Revision: 35438

Log:
Adding hudi KEYS 

Added:
dev/hudi/KEYS

Added: dev/hudi/KEYS
==
--- dev/hudi/KEYS (added)
+++ dev/hudi/KEYS Wed Aug 28 16:58:40 2019
@@ -0,0 +1,275 @@
+This file contains the PGP keys of various developers.
+
+Users: pgp < KEYS
+   gpg --import KEYS
+Developers:
+pgp -kxa <your name> and append it to this file.
+(pgpk -ll <your name> && pgpk -xa <your name>) >> this file.
+(gpg --list-sigs <your name>
+ && gpg --armor --export <your name>) >> this file.
+
+pub   4096R/D3541808 2014-01-09
+uid   [ultimate] Suneel Marthi (CODE SIGNING KEY) 
+sig 3D3541808 2014-01-09  Suneel Marthi (CODE SIGNING KEY) 

+sub   4096R/AF46E2DE 2014-01-09
+sig  D3541808 2014-01-09  Suneel Marthi (CODE SIGNING KEY) 

+
+-BEGIN PGP PUBLIC KEY BLOCK-
+Comment: GPGTools - https://gpgtools.org
+
+mQINBFLPJmEBEAC9d/dUZCXeyhB0fVGmJAjdjXfLebav4VqGdNZC+M1T9C3dcVsh
+X/JGme5bjJeIgVwiH5UsdNceYn1+hyxs8jXuRAWEWKP76gD+pNrp8Az0ZdBkJoAy
+zCywOPtJV2PCOz7+S5ri2nUA2+1Kgcu6IlSLMmYAGO0IAmRrjBEzxy9iGaxiNGTc
+LvQt/iVtIXWkKKI8yvpoJ8iFf3TGhpjgaC/h7cJP3zpy0SScmhJJASLXRsfocLv9
+sle6ndN9IPbDtRW8cL7Fk3VQlzp1ToVjmnQTyZZ6S1WafsjzCZ9hLN+k++o8VbvY
+v3icY6Sy0BKz0J6KwaxTkuZ6w1K7oUkVOQboKaWFIEdO+jwrEmU+Puyd8Np8jLnF
+Q0Y5GPfyMlqM3S/zaDm1t4D1eb5FLciStkxfg5wPVK6TkqB325KVD3aio5C7E7kt
+aQechHxaJXCQOtCtVY4X+L4iClnMSuk+hcSc8W8MYRTSVansItK0vI9eQZXMnpan
+w9/jk5rS4Gts1rHB7+kdjT3QRJmkyk6fEFT0fz5tfMC7N8waeEUhCaRW6lAoiqDW
+NW1h+0UGxJw+9YcGxBC0kkt3iofNOWQWmuf/BS3DHPKT7XV/YtBHe44wW0sF5L5P
+nfQUHpnA3pcZ0En6bXAvepKVZTNdOWWJqMyHV+436DA+33h45QL6lWb/GwARAQAB
+tDVTdW5lZWwgTWFydGhpIChDT0RFIFNJR05JTkcgS0VZKSA8c21hcnRoaUBhcGFj
+aGUub3JnPokCNwQTAQoAIQUCUs8mYQIbAwULCQgHAwUVCgkICwUWAgMBAAIeAQIX
+gAAKCRC08czE01QYCOKKEAChRtHBoYNTX+RZbFO0Kl1GlN+i1Ik0shEm5ZJ56XHv
+AnFx/gRK7CfZzJswWo7kf2s/dvJiFfs+rrolYVuO6E8gNhAaTEomSuvWQAMHdPcR
+9G5APRKCSkbZYugElqplEbSphk78FKoFO+sml52M7Pr9jj88ApBjoFVVY8njdnNq
+6DVlaDsg8YninCD78Z7PNFnRGwxyZ8Qd4Dh0rG+MUTfAWopZu6/MxpQxU7QpeVeX
+SIMLg7ClFrGfXnZcszYF4dnav1aa0i7W88PAdYNPko7tC5qz5yv2ep7t2gRbcYKf
+RXhYC2FHQey3wPhMKjA8V436lAqmfYnY/YdmhEy9Xq/1EdX1nHsQ7OEkfgXK14WM
+F+rnqXRAl/0cwiyb41eocdg5kpZFIKgCYT02usLWxwNnd3jOCe109Ze3y3acN/G8
++xOf9YRfNVAe6pD8H6ieRbv9gRjBmsbz9bXQCmxFnDqxNri5Me6gBAQPNmYTJD0h
+jgJTK6o0vJ0pwjBLauasJsLu+1tR3Cb0dxPE+JVaTF26FCd7pM7W6KdVfod9ZfrN
+cSyJ/cECc2KvYVGmTjQNVo1dYG0awBachlWnYNt+0Qx4opLsczZOLtPKtFY4BJA7
+aZoXT4Qf9yB8km7x2/cgNExVbFummToJ/IP3M39/EaryspsQQuM5Qu5Q5lZp8Qnn
+ybkCDQRSzyZhARAA7bAawFzbJaghYnm6mTZyGG5hQmfAynbF6cPAE+g2SnXcNQjP
+6kjYx3tSpb7rEzmjQqs46ztqdec6PIVBMhakON6z27Zz+IviAtO/TcaZHWNuCAjw
+FXVQZ+tYsSeiKInttfkrQc8jXAHWwSkSjLqNpvQpBdBEX80MYkFB6ZPOeON2+/Ta
+GC1H/HU2YngF0qQSmG33KKG6ezihBJdKxU6t2tsQfTlCmZW6R6MGpS9fVurYMKBk
+vR+7RGZ/H6dSjWPcpxhusGg92J9uz7r5SopN1wSdyPMUCMAFGeyoxcAuBDl38quU
+H/ENG3x5LDPq2aEH2AJ6yvZfIXbeJ1zmXf2cAHv+HbmvZaTSp0XIjq8Yxh8NkYEC
+ZdfRWmsGLIpU16TkBijpK3Dn9MDXjHGT3V8/qfdpURtMvIaL8WFrq9ejcy/vGRFn
+mCYqxIIPH+vLiMXKWtuMc61GN3ES21msKQH6IuQxxfQLyhK44L/pv7FpF4E+6LaE
+8uRwAex5HIDpR1v4aJq089rRtye9VXTJJLZ7lYs0HctdZ30QbBRWT4jS9d9rj3cr
+HgQ7mIGO9TAfK2kWc6AJN/EvxPWNbOwptsTUzAF/adiy9ax8C18iw7nKczC+2eN6
+UcbxXiPdytuKYK7O9A8S9e1w89GwpxYN7Xfn2o6QfpSbL9cLKiinOeV+xikAEQEA
+AYkCHwQYAQoACQUCUs8mYQIbDAAKCRC08czE01QYCG7yD/471dmyOD+go8cZkdqR
+3CHhjH03odtI0EJNVy4VGEC0r9paz3BWYTy18LqWYkw3ygphOIU1r8/7QK3H5Ke3
+c4yCSUxaMk5SlAJ+iVRek5TABkR8+zI+ZN5pQtqRH+ya5JxV4F/Sx5Q3KWMzpvgY
+n6AgSSc3hEfkgdI7SalIeyLaLDWv+RFdGZ5JU5gD28C0G8BeH8L62x6sixZcqoGT
+oy9rwkjs45/ZmmvBZhd1wLvC/au8l2Ecou6O8+8m26W8Z7vCuGKxuWn0KV3DLLWe
+66uchDVlakGoMJSPIK06JWYUlE+gL0CW+U2ekt/v2qb8hGgMVET3CBAMq+bFWuJ6
+juX7hJd7wHtCFfjnFDDAkdp2IIIZAlBW6FZGv7pJ82xsW6pSAg0A7VrV6nTtMtDv
+T8esOfo/t4t0gaL7bivy9DVVdATbUBcJJFpoVoe5MxiyjptveqPzIRwzt04n52Ph
+ordVWAnX5AokXWTg+Glem/EWEuf7jUuZArfqCSl/sZoQdXGTjR7G4iFscispji4+
+kNjVQsItqFbgDpuc6n+GcFxlKQ7YMCnu5MVtTV01U4lFs0qy0NTUqsuR35DM4z14
+DkFmj1upWAayCoXTpKzsHBvJZPC+Wqf9Pl3O47apelg7KxU3S011YfXpVPvCTKBv
+kD2o/5GKWS5QkSUEUXXY1oDiLg==
+=f8kJ
+-END PGP PUBLIC KEY BLOCK-
+
+pub   rsa4096 2019-07-29 [SC]
+  AF9BAF79D311A3D3288E583F24A499037262AAA4
+uid   [ultimate] Balaji Varadarajan 
+sig 324A499037262AAA4 2019-07-29  Balaji Varadarajan 

+sub   rsa4096 2019-07-29 [E]
+sig  24A499037262AAA4 2019-07-29  Balaji Varadarajan 

+
+-BEGIN PGP PUBLIC KEY BLOCK-
+
+mQINBF0+XtEBEADpNIZkDKZrwrHy7x8uJBSelnMGvd9z6+PYmvWYVvoGnjipjC7L
+fXzaZGofmKxDEKtQI5ip/4DlX/vRVjwNdaPfelLCPN+dZy73m2NcYH2v9OgVNf/L
+L6eqispkqIbmGRwJqq3YfsrDSqlJ5gS9B7/rSUyKx33sKzm0uHT+E/fg45q8AJBn
+ef/Y2zvSu7Stv9wYrXGBrOlwBpiRUoobcF7utAtLcr18DLgRD3K3trWpjLJqFf6O
+LDiFR25VmCQ6Lr/vPKICil75Z91CgRzkHl44drZffzqOzljz62nawSMhxzuX8ryO
+pTG8Wq3U1dS3699iCgMPYeHB4C43c0ieZf/+y7uJD7GwW7Jfnc1GuN3OwiDA16yh
+NfDQhhXlZf+iKAOBhkIGqYgy2+l587etTZqUBKWIjxwVobhX6VHKXDTC7YYxnw8n
+4emuF4nxC5ySfuJBaMFCTBvgALoBPJA4spS+uBFVygM7/ZMR2KUywhajqbpm4iEw
+ReBHF+MTJE3ZdEF8PsI7zbV68BaYCe6muVxgwuhnZ9ek9mcdIsMJLeIQncMiQT5U

[GitHub] [incubator-hudi] bvaradar commented on issue #859: Hudi upsert after a delete in partition will cause valid records inserted to disappear.

2019-08-28 Thread GitBox
bvaradar commented on issue #859: Hudi upsert after a delete in partition will 
cause valid records inserted to disappear.
URL: https://github.com/apache/incubator-hudi/issues/859#issuecomment-525855438
 
 
   @smdahmed : 
   I am assuming each step is a separate hoodie commit. Could you list the 
partition and the .hoodie folder after finishing each operation? Also, could 
you copy the contents of the metadata file (.commit/.deltacommit) 
corresponding to each operation and attach them?




[jira] [Created] (HUDI-226) Hudi Website - Provide links to documentation corresponding to older release versions

2019-08-28 Thread BALAJI VARADARAJAN (Jira)
BALAJI VARADARAJAN created HUDI-226:
---

 Summary: Hudi Website - Provide links to documentation 
corresponding to older release versions
 Key: HUDI-226
 URL: https://issues.apache.org/jira/browse/HUDI-226
 Project: Apache Hudi (incubating)
  Issue Type: New Feature
  Components: Docs, docs-chinese, newbie
Reporter: BALAJI VARADARAJAN
Assignee: vinoyang


While it may be too difficult to do this retroactively for previous versions, 
we need to support this for Apache releases. 

See the Flink website (e.g. [https://flink.apache.org/]), where you will see a 
link to the 1.9 version docs: 
[https://ci.apache.org/projects/flink/flink-docs-release-1.9/]

For the older releases 0.4.6 and 0.4.7, we have created the git tags 
*hoodie-site-0.4.6* and *hoodie-site-0.4.7*.

*You can check out the tags and read README.md to access and run the website 
locally.*

 





[jira] [Commented] (HUDI-225) Create Hudi Timeline Server Fat Jar

2019-08-28 Thread leesf (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16918153#comment-16918153
 ] 

leesf commented on HUDI-225:


Will work on this ticket.

> Create Hudi Timeline Server Fat Jar 
> 
>
> Key: HUDI-225
> URL: https://issues.apache.org/jira/browse/HUDI-225
> Project: Apache Hudi (incubating)
>  Issue Type: Task
>  Components: Deployment, newbie
>Reporter: BALAJI VARADARAJAN
>Assignee: leesf
>Priority: Minor
>
> We need to add maven module under packaging named hudi-timeline-server-bundle 
> to bundle timeline service 
>  In the pom, add the following shading configurations
> 
>  org.apache.maven.plugins
>  maven-shade-plugin
>  2.4
>  
>  true
>  
>  
>  *:*
>  
>  META-INF/*.SF
>  META-INF/*.DSA
>  META-INF/*.RSA
>  
>  
>  
>  
>  
>  
>  package
>  
>  shade
>  
>  
>  
>   implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer"
>  />
>   implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
>  org.apache.hudi.timeline.service.TimelineService
>  
>  
>  
>  
>  
>  
>  
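
The XML element tags of the shading configuration quoted above were lost in the archive; only the element values survive. A plausible reconstruction, following the standard maven-shade-plugin layout, might look like the sketch below. The element names are assumptions inferred from the surviving values; in particular, which boolean option the lone `true` belongs to cannot be recovered from the archive, and `createDependencyReducedPom` is only a guess.

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <version>2.4</version>
  <configuration>
    <!-- Assumption: the element name for "true" is not recoverable from the archive -->
    <createDependencyReducedPom>true</createDependencyReducedPom>
    <filters>
      <filter>
        <artifact>*:*</artifact>
        <excludes>
          <!-- Drop signature files so the shaded jar is not rejected as tampered -->
          <exclude>META-INF/*.SF</exclude>
          <exclude>META-INF/*.DSA</exclude>
          <exclude>META-INF/*.RSA</exclude>
        </excludes>
      </filter>
    </filters>
  </configuration>
  <executions>
    <execution>
      <phase>package</phase>
      <goals>
        <goal>shade</goal>
      </goals>
      <configuration>
        <transformers>
          <transformer implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer"/>
          <transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
            <!-- Sets Main-Class in the jar manifest so the bundle can be run with java -jar -->
            <mainClass>org.apache.hudi.timeline.service.TimelineService</mainClass>
          </transformer>
        </transformers>
      </configuration>
    </execution>
  </executions>
</plugin>
```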





[jira] [Commented] (HUDI-225) Create Hudi Timeline Server Fat Jar

2019-08-28 Thread BALAJI VARADARAJAN (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16918156#comment-16918156
 ] 

BALAJI VARADARAJAN commented on HUDI-225:
-

Thanks [~xleesf]

> Create Hudi Timeline Server Fat Jar 
> 
>
> Key: HUDI-225
> URL: https://issues.apache.org/jira/browse/HUDI-225
> Project: Apache Hudi (incubating)
>  Issue Type: Task
>  Components: Deployment, newbie
>Reporter: BALAJI VARADARAJAN
>Assignee: leesf
>Priority: Minor
>
> We need to add maven module under packaging named hudi-timeline-server-bundle 
> to bundle timeline service 
>  In the pom, add the following shading configurations
> 
>  org.apache.maven.plugins
>  maven-shade-plugin
>  2.4
>  
>  true
>  
>  
>  *:*
>  
>  META-INF/*.SF
>  META-INF/*.DSA
>  META-INF/*.RSA
>  
>  
>  
>  
>  
>  
>  package
>  
>  shade
>  
>  
>  
>   implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer"
>  />
>   implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
>  org.apache.hudi.timeline.service.TimelineService
>  
>  
>  
>  
>  
>  
>  





[GitHub] [incubator-hudi] thesuperzapper edited a comment on issue #638: [HUDI-12] Upgrade to Spark 2.4, Avro 1.8.2, Parquet 1.10.0...

2019-08-28 Thread GitBox
thesuperzapper edited a comment on issue #638: [HUDI-12] Upgrade to Spark 2.4, 
Avro 1.8.2, Parquet 1.10.0... 
URL: https://github.com/apache/incubator-hudi/pull/638#issuecomment-525954787
 
 
   #846, and #850 have made most of this PR redundant.
   
   @vinothchandar @bvaradar do we still want to bump to Spark 2.4 minimum (+ 
associated avro/parquet)?
   
   **I can see a few reasons why we still might want to do this:**
   * To use spark-avro from core spark, instead of databricks (Introduced in 
Spark 2.4)
 * This also lets us move to AVRO 1.8.2, which introduced support for 
[logical types](https://avro.apache.org/docs/1.8.2/spec.html#Logical+Types).
   * To use the Datasource V2 API (Introduced in Spark 2.3)
   
   




[GitHub] [incubator-hudi] vinothchandar merged pull request #858: HUDI-124 : Exclude jdk.tools from hadoop-common and update Notice files

2019-08-28 Thread GitBox
vinothchandar merged pull request #858: HUDI-124 : Exclude jdk.tools from 
hadoop-common and update Notice files
URL: https://github.com/apache/incubator-hudi/pull/858
 
 
   




[incubator-hudi] branch master updated: HUDI-124 : Exclude jdk.tools from hadoop-common and update Notice files (#858)

2019-08-28 Thread vinoth
This is an automated email from the ASF dual-hosted git repository.

vinoth pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-hudi.git


The following commit(s) were added to refs/heads/master by this push:
 new 5f9fa82  HUDI-124 : Exclude jdk.tools from hadoop-common and update 
Notice files (#858)
5f9fa82 is described below

commit 5f9fa82f47e1cc14a22b869250fe23c8f9c033cd
Author: Balaji Varadarajan 
AuthorDate: Wed Aug 28 16:20:47 2019 -0700

HUDI-124 : Exclude jdk.tools from hadoop-common and update Notice files 
(#858)
---
 NOTICE.txt | 82 +-
 docker/hoodie/hadoop/NOTICE.txt| 47 ++---
 docker/hoodie/hadoop/base/NOTICE.txt   | 29 +++-
 docker/hoodie/hadoop/datanode/NOTICE.txt   | 30 +++-
 docker/hoodie/hadoop/historyserver/NOTICE.txt  | 30 +++-
 docker/hoodie/hadoop/hive_base/NOTICE.txt  | 30 +++-
 docker/hoodie/hadoop/namenode/NOTICE.txt   | 30 +++-
 docker/hoodie/hadoop/spark_base/NOTICE.txt | 31 +++-
 docker/hoodie/hadoop/sparkadhoc/NOTICE.txt | 32 +++--
 docker/hoodie/hadoop/sparkmaster/NOTICE.txt| 32 +++--
 docker/hoodie/hadoop/sparkworker/NOTICE.txt| 32 +++--
 hudi-cli/src/main/resources/META-INF/NOTICE.txt| 35 +++--
 hudi-client/src/main/resources/META-INF/NOTICE.txt | 23 ++
 hudi-common/src/main/resources/META-INF/NOTICE.txt |  9 +--
 .../src/main/resources/META-INF/NOTICE.txt | 17 +
 hudi-hive/src/main/resources/META-INF/NOTICE.txt   | 18 ++---
 .../src/main/resources/META-INF/NOTICE.txt | 44 
 hudi-spark/src/main/resources/META-INF/NOTICE.txt  | 26 +++
 .../src/main/resources/META-INF/NOTICE.txt |  7 +-
 .../src/main/resources/META-INF/NOTICE.txt | 27 +++
 .../src/main/resources/META-INF/HUDI_NOTICE.txt| 19 ++---
 .../src/main/resources/META-INF/HUDI_NOTICE.txt| 22 ++
 .../src/main/resources/META-INF/HUDI_NOTICE.txt| 19 ++---
 .../src/main/resources/META-INF/HUDI_NOTICE.txt| 26 +++
 .../src/main/resources/META-INF/HUDI_NOTICE.txt| 29 +++-
 pom.xml| 10 +++
 26 files changed, 251 insertions(+), 485 deletions(-)

diff --git a/NOTICE.txt b/NOTICE.txt
index 1f5c21d..3246581 100644
--- a/NOTICE.txt
+++ b/NOTICE.txt
@@ -26,13 +26,11 @@ This project includes:
   Apache Calcite Avatica under Apache License, Version 2.0
   Apache Calcite Avatica Metrics under Apache License, Version 2.0
   Apache Commons Collections under Apache License, Version 2.0
-  Apache Commons Compress under Apache License, Version 2.0
-  Apache Commons Configuration under Apache License, Version 2.0
+  Apache Commons Compress under The Apache Software License, Version 2.0
   Apache Commons Crypto under Apache License, Version 2.0
   Apache Commons IO under Apache License, Version 2.0
   Apache Commons Lang under Apache License, Version 2.0
   Apache Commons Logging under The Apache Software License, Version 2.0
-  Apache Commons Math under The Apache Software License, Version 2.0
   Apache Curator under The Apache Software License, Version 2.0
   Apache Derby Database Engine and Embedded JDBC Driver under Apache 2
   Apache Directory API ASN.1 API under The Apache Software License, Version 2.0
@@ -55,7 +53,6 @@ This project includes:
   Apache HBase - Server under Apache License, Version 2.0
   Apache HBase - Testing Util under Apache License, Version 2.0
   Apache HttpClient under Apache License, Version 2.0
-  Apache HttpClient Fluent API under Apache License, Version 2.0
   Apache HttpCore under Apache License, Version 2.0
   Apache Ivy under The Apache Software License, Version 2.0
   Apache Kafka under The Apache Software License, Version 2.0
@@ -66,21 +63,13 @@ This project includes:
   Apache Log4j SLF4J Binding under The Apache Software License, Version 2.0
   Apache Log4j Web under The Apache Software License, Version 2.0
   Apache Parquet Avro under The Apache Software License, Version 2.0
-  Apache Parquet Avro (Incubating) under The Apache Software License, Version 
2.0
   Apache Parquet Column under The Apache Software License, Version 2.0
-  Apache Parquet Column (Incubating) under The Apache Software License, 
Version 2.0
   Apache Parquet Common under The Apache Software License, Version 2.0
-  Apache Parquet Common (Incubating) under The Apache Software License, 
Version 2.0
   Apache Parquet Encodings under The Apache Software License, Version 2.0
-  Apache Parquet Encodings (Incubating) under The Apache Software License, 
Version 2.0
   Apache Parquet Format (Incubating) under The Apache Software License, 
Version 2.0
-  Apache Parquet Generator (Incubating) under The Apache Software License, 
Version 2.0
   Apache Parquet Hadoop under The Apache Software License, Version 2.0
-  Apache Parquet Hadoop (Incubating) under The 

[incubator-hudi] tag hoodie-site-0.4.6 created (now 5976398)

2019-08-28 Thread vbalaji
This is an automated email from the ASF dual-hosted git repository.

vbalaji pushed a change to tag hoodie-site-0.4.6
in repository https://gitbox.apache.org/repos/asf/incubator-hudi.git.


  at 5976398  (commit)
No new revisions were added by this update.



[GitHub] [incubator-hudi] afilipchik commented on issue #638: [HUDI-12] Upgrade to Spark 2.4, Avro 1.8.2, Parquet 1.10.0...

2019-08-28 Thread GitBox
afilipchik commented on issue #638: [HUDI-12] Upgrade to Spark 2.4, Avro 1.8.2, 
Parquet 1.10.0... 
URL: https://github.com/apache/incubator-hudi/pull/638#issuecomment-525947387
 
 
   hey, are there any plans to finish this one?




[jira] [Assigned] (HUDI-225) Create Hudi Timeline Server Fat Jar

2019-08-28 Thread leesf (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

leesf reassigned HUDI-225:
--

Assignee: leesf

> Create Hudi Timeline Server Fat Jar 
> 
>
> Key: HUDI-225
> URL: https://issues.apache.org/jira/browse/HUDI-225
> Project: Apache Hudi (incubating)
>  Issue Type: Task
>  Components: Deployment, newbie
>Reporter: BALAJI VARADARAJAN
>Assignee: leesf
>Priority: Minor
>
> We need to add maven module under packaging named hudi-timeline-server-bundle 
> to bundle timeline service 
>  In the pom, add the following shading configurations
> 
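> <plugin>
>   <groupId>org.apache.maven.plugins</groupId>
>   <artifactId>maven-shade-plugin</artifactId>
>   <version>2.4</version>
>   <configuration>
>     <createDependencyReducedPom>true</createDependencyReducedPom>
>     <filters>
>       <filter>
>         <artifact>*:*</artifact>
>         <excludes>
>           <exclude>META-INF/*.SF</exclude>
>           <exclude>META-INF/*.DSA</exclude>
>           <exclude>META-INF/*.RSA</exclude>
>         </excludes>
>       </filter>
>     </filters>
>   </configuration>
>   <executions>
>     <execution>
>       <phase>package</phase>
>       <goals>
>         <goal>shade</goal>
>       </goals>
>       <configuration>
>         <transformers>
>           <transformer implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer"/>
>           <transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
>             <mainClass>org.apache.hudi.timeline.service.TimelineService</mainClass>
>           </transformer>
>         </transformers>
>       </configuration>
>     </execution>
>   </executions>
> </plugin>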



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Closed] (HUDI-222) Rename main class path to org.apache.hudi.timeline.service.TimelineService in run_server.sh

2019-08-28 Thread leesf (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

leesf closed HUDI-222.
--
Fix Version/s: 0.6.0
   Resolution: Fixed

> Rename main class path to org.apache.hudi.timeline.service.TimelineService in 
> run_server.sh
> ---
>
> Key: HUDI-222
> URL: https://issues.apache.org/jira/browse/HUDI-222
> Project: Apache Hudi (incubating)
>  Issue Type: Bug
>Reporter: leesf
>Assignee: leesf
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.6.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> The current main class path in run_server.sh is
> {code:java}
> com.uber.hoodie.timeline.service.TimelineService
> {code}
> , however, it should be changed to
> {code:java}
>  org.apache.hudi.timeline.service.TimelineService{code}





[GitHub] [incubator-hudi] thesuperzapper commented on issue #638: [HUDI-12] Upgrade to Spark 2.4, Avro 1.8.2, Parquet 1.10.0...

2019-08-28 Thread GitBox
thesuperzapper commented on issue #638: [HUDI-12] Upgrade to Spark 2.4, Avro 
1.8.2, Parquet 1.10.0... 
URL: https://github.com/apache/incubator-hudi/pull/638#issuecomment-525954787
 
 
   #846, and #851 have made most of this PR redundant.
   
   @vinothchandar @bvaradar do we still want to bump to Spark 2.4 minimum (+ 
associated avro/parquet)?
   
   **I can see a few reasons why we still might want to do this:**
   * To use spark-avro from core spark, instead of databricks (Introduced in 
Spark 2.4)
 * This also lets us move to AVRO 1.8.2, which introduced support for 
[logical types](https://avro.apache.org/docs/1.8.2/spec.html#Logical+Types).
   * To use the Datasource V2 API (Introduced in Spark 2.3)
   
   




[jira] [Created] (HUDI-225) Create Hudi Timeline Server Fat bundle

2019-08-28 Thread BALAJI VARADARAJAN (Jira)
BALAJI VARADARAJAN created HUDI-225:
---

 Summary: Create Hudi Timeline Server Fat bundle 
 Key: HUDI-225
 URL: https://issues.apache.org/jira/browse/HUDI-225
 Project: Apache Hudi (incubating)
  Issue Type: Task
  Components: Deployment
Reporter: BALAJI VARADARAJAN


We need to add maven module under packaging named hudi-timeline-server-bundle 
to bundle timeline service 

 In the pom, add the following shading configurations


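 <plugin>
   <groupId>org.apache.maven.plugins</groupId>
   <artifactId>maven-shade-plugin</artifactId>
   <version>2.4</version>
   <configuration>
     <createDependencyReducedPom>true</createDependencyReducedPom>
     <filters>
       <filter>
         <artifact>*:*</artifact>
         <excludes>
           <exclude>META-INF/*.SF</exclude>
           <exclude>META-INF/*.DSA</exclude>
           <exclude>META-INF/*.RSA</exclude>
         </excludes>
       </filter>
     </filters>
   </configuration>
   <executions>
     <execution>
       <phase>package</phase>
       <goals>
         <goal>shade</goal>
       </goals>
       <configuration>
         <transformers>
           <transformer implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer"/>
           <transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
             <mainClass>org.apache.hudi.timeline.service.TimelineService</mainClass>
           </transformer>
         </transformers>
       </configuration>
     </execution>
   </executions>
 </plugin>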





[jira] [Updated] (HUDI-225) Create Hudi Timeline Server Fat bundle

2019-08-28 Thread BALAJI VARADARAJAN (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BALAJI VARADARAJAN updated HUDI-225:

Component/s: newbie

> Create Hudi Timeline Server Fat bundle 
> ---
>
> Key: HUDI-225
> URL: https://issues.apache.org/jira/browse/HUDI-225
> Project: Apache Hudi (incubating)
>  Issue Type: Task
>  Components: Deployment, newbie
>Reporter: BALAJI VARADARAJAN
>Priority: Minor
>
> We need to add maven module under packaging named hudi-timeline-server-bundle 
> to bundle timeline service 
>  In the pom, add the following shading configurations
> 
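> <plugin>
>   <groupId>org.apache.maven.plugins</groupId>
>   <artifactId>maven-shade-plugin</artifactId>
>   <version>2.4</version>
>   <configuration>
>     <createDependencyReducedPom>true</createDependencyReducedPom>
>     <filters>
>       <filter>
>         <artifact>*:*</artifact>
>         <excludes>
>           <exclude>META-INF/*.SF</exclude>
>           <exclude>META-INF/*.DSA</exclude>
>           <exclude>META-INF/*.RSA</exclude>
>         </excludes>
>       </filter>
>     </filters>
>   </configuration>
>   <executions>
>     <execution>
>       <phase>package</phase>
>       <goals>
>         <goal>shade</goal>
>       </goals>
>       <configuration>
>         <transformers>
>           <transformer implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer"/>
>           <transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
>             <mainClass>org.apache.hudi.timeline.service.TimelineService</mainClass>
>           </transformer>
>         </transformers>
>       </configuration>
>     </execution>
>   </executions>
> </plugin>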





[incubator-hudi] 02/02: [hotfix] change hoodie-timeline-*.jar to hudi-timeline-*.jar

2019-08-28 Thread vbalaji

vbalaji pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-hudi.git

commit 00cfe72c5d56dbaa2e170a087cf04b1e56a87bc3
Author: leesf <490081...@qq.com>
AuthorDate: Wed Aug 28 16:42:18 2019 +0800

[hotfix] change hoodie-timeline-*.jar to hudi-timeline-*.jar
---
 hudi-timeline-service/run_server.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hudi-timeline-service/run_server.sh b/hudi-timeline-service/run_server.sh
index 81fe529..24ea7dc 100755
--- a/hudi-timeline-service/run_server.sh
+++ b/hudi-timeline-service/run_server.sh
@@ -2,7 +2,7 @@
 
 DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"
 #Ensure we pick the right jar even for hive11 builds
-HOODIE_JAR=`ls -c $DIR/target/hoodie-timeline-*.jar | grep -v test | head -1`
+HOODIE_JAR=`ls -c $DIR/target/hudi-timeline-*.jar | grep -v test | head -1`
 
 if [ -z "$HADOOP_CONF_DIR" ]; then
   echo "setting hadoop conf dir"



[incubator-hudi] 01/02: [HUDI-222] Rename main class path to org.apache.hudi.timeline.service.TimelineService in run_server.sh

2019-08-28 Thread vbalaji

vbalaji pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-hudi.git

commit b44f8521f2b25928bccce9a6ba5c89fb76e5330e
Author: leesf <490081...@qq.com>
AuthorDate: Wed Aug 28 16:14:15 2019 +0800

[HUDI-222] Rename main class path to 
org.apache.hudi.timeline.service.TimelineService in run_server.sh
---
 hudi-timeline-service/run_server.sh | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/hudi-timeline-service/run_server.sh b/hudi-timeline-service/run_server.sh
index f1bb245..81fe529 100755
--- a/hudi-timeline-service/run_server.sh
+++ b/hudi-timeline-service/run_server.sh
@@ -10,5 +10,5 @@ if [ -z "$HADOOP_CONF_DIR" ]; then
 fi
 
 OTHER_JARS=`ls -1 $DIR/target/lib/*jar | grep -v '*avro*-1.' | tr '\n' ':'`
-echo "Running command : java -cp $DIR/target/:${HADOOP_CONF_DIR}:$HOODIE_JAR:$OTHER_JARS com.uber.hoodie.timeline.service.TimelineService $@"
-java -Xmx4G -cp $DIR/target/test-classes/:${HADOOP_CONF_DIR}:$HOODIE_JAR:$OTHER_JARS com.uber.hoodie.timeline.service.TimelineService "$@"
+echo "Running command : java -cp $DIR/target/:${HADOOP_CONF_DIR}:$HOODIE_JAR:$OTHER_JARS org.apache.hudi.timeline.service.TimelineService $@"
+java -Xmx4G -cp $DIR/target/test-classes/:${HADOOP_CONF_DIR}:$HOODIE_JAR:$OTHER_JARS org.apache.hudi.timeline.service.TimelineService "$@"
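
The rename matters because `java -cp ... <main class>` resolves the class by its fully qualified name at launch, so a stale package prefix would be expected to fail with a class-not-found error. A minimal sketch of checking whether a name resolves on the current classpath follows; the class `MainClassCheck` and its helper `isLoadable` are hypothetical names for illustration and are not part of Hudi.

```java
// Sketch only: MainClassCheck and isLoadable are illustrative names, not Hudi code.
final class MainClassCheck {

    // Returns true when the fully qualified class name can be resolved
    // on the current classpath, without running its static initializers.
    static boolean isLoadable(String className) {
        try {
            Class.forName(className, false, MainClassCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Pass the candidate main class names as arguments.
        for (String name : args) {
            System.out.println(name + " loadable: " + isLoadable(name));
        }
    }
}
```

Run with the same classpath that run_server.sh builds, this would be expected to report the old `com.uber.hoodie` name as not loadable once the packages have moved.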



[incubator-hudi] branch master updated (41dbac6 -> 00cfe72)

2019-08-28 Thread vbalaji

vbalaji pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-hudi.git.


from 41dbac6  Fixed unit test
 new b44f852  [HUDI-222] Rename main class path to 
org.apache.hudi.timeline.service.TimelineService in run_server.sh
 new 00cfe72  [hotfix] change hoodie-timeline-*.jar to hudi-timeline-*.jar

The 2 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 hudi-timeline-service/run_server.sh | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)



[GitHub] [incubator-hudi] DavisBroda opened a new issue #860: Unit tests fail when run on IBM SDK

2019-08-28 Thread GitBox
DavisBroda opened a new issue #860: Unit tests fail when run on IBM SDK
URL: https://github.com/apache/incubator-hudi/issues/860
 
 
   
   When running the incubator-hudi tests with the IBM SDK as the JVM, several tests fail that pass under OpenJDK.
   
   IBM SDK info
   JAVA_VERSION="1.8.0_201"
   OS_NAME="Linux"
   OS_VERSION="2.6"
   OS_ARCH="i586"
   SOURCE=""
   
   The IBM SDK can be found [here](https://developer.ibm.com/javasdk/downloads/sdk8/).
   I tested this with the Linux on Intel version.
   
   
   Errors noticed:
   
   In hudi-common
   
   HoodieLogFormatTest.java
 - testAvroLogRecordReaderWithRollbackPartialBlock
 - testAvroLogRecordReaderWithRollbackTombstone
 - testAvroLogRecordReaderWithInvalidRollback
 - testBasicAppendAndScanMultipleFiles
 - testAvroLogRecordReaderBasic
 - testAvroLogRecordReaderWithDeleteAndRollback
 
   TestDiskBasedMap.java
 - tstSizeEstimator
 
   TestExternalSpillableMap.java
 - simpleInsertTest
 - simpleTestWithException
 - testAllMapOperations
 - testDataCorrectnessWithUpsertsToDatainMapAndOnDisk
 - testDataCorrectnessWithoutHoodieMetadata
 - testSimpleUpsert
 
   All of the above fail with the exception below:
   ```
   java.lang.NoClassDefFoundError: org.apache.hudi.common.util.ObjectSizeCalculator$CurrentLayout (initialization failure)
   
       at java.lang.J9VMInternals.initializationAlreadyFailed(J9VMInternals.java:96)
       at org.apache.hudi.common.util.ObjectSizeCalculator.getObjectSize(ObjectSizeCalculator.java:138)
       at org.apache.hudi.common.util.DefaultSizeEstimator.sizeEstimate(DefaultSizeEstimator.java:29)
       at org.apache.hudi.common.util.collection.ExternalSpillableMap.put(ExternalSpillableMap.java:173)
       at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner.processNextRecord(HoodieMergedLogRecordScanner.java:117)
       at org.apache.hudi.common.table.log.AbstractHoodieLogRecordScanner.processAvroDataBlock(AbstractHoodieLogRecordScanner.java:280)
       at org.apache.hudi.common.table.log.AbstractHoodieLogRecordScanner.processQueuedBlocksForInstant(AbstractHoodieLogRecordScanner.java:310)
       at org.apache.hudi.common.table.log.AbstractHoodieLogRecordScanner.scan(AbstractHoodieLogRecordScanner.java:240)
       at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner.<init>(HoodieMergedLogRecordScanner.java:78)
       at org.apache.hudi.common.table.log.HoodieLogFormatTest.testAvroLogRecordReaderWithInvalidRollback(HoodieLogFormatTest.java:981)
       ...
   Caused by: java.lang.UnsupportedOperationException: ObjectSizeCalculator only supported on HotSpot VM
       at org.apache.hudi.common.util.ObjectSizeCalculator.getEffectiveMemoryLayoutSpecification(ObjectSizeCalculator.java:372)
       at org.apache.hudi.common.util.ObjectSizeCalculator$CurrentLayout.<clinit>(ObjectSizeCalculator.java:118)
       at org.apache.hudi.common.util.ObjectSizeCalculator.getObjectSize(ObjectSizeCalculator.java:138)
       at org.apache.hudi.common.util.DefaultSizeEstimator.sizeEstimate(DefaultSizeEstimator.java:29)
       at org.apache.hudi.common.util.collection.ExternalSpillableMap.put(ExternalSpillableMap.java:173)
       at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner.processNextRecord(HoodieMergedLogRecordScanner.java:117)
       at org.apache.hudi.common.table.log.AbstractHoodieLogRecordScanner.processAvroDataBlock(AbstractHoodieLogRecordScanner.java:280)
       at org.apache.hudi.common.table.log.AbstractHoodieLogRecordScanner.processQueuedBlocksForInstant(AbstractHoodieLogRecordScanner.java:310)
       at org.apache.hudi.common.table.log.AbstractHoodieLogRecordScanner.scan(AbstractHoodieLogRecordScanner.java:240)
       at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner.<init>(HoodieMergedLogRecordScanner.java:78)
       at org.apache.hudi.common.table.log.HoodieLogFormatTest.testAvroLogRecordReaderWithRollbackPartialBlock(HoodieLogFormatTest.java:742)
       ... 42 more
   ```
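
The `Caused by` line suggests that the size calculator refuses to initialize on any VM it does not recognize as HotSpot. A minimal sketch of that kind of guard, assuming it keys off the standard `java.vm.name` system property; the class `VmNameCheck`, the helper `looksLikeHotSpot`, and the name prefixes are illustrative assumptions, not copied from `ObjectSizeCalculator`:

```java
// Sketch only: the prefixes below are assumptions, not taken from the Hudi source.
final class VmNameCheck {

    // Hypothetical helper: true when the VM name looks like a HotSpot-derived JVM.
    static boolean looksLikeHotSpot(String vmName) {
        return vmName != null
                && (vmName.startsWith("Java HotSpot(TM)") || vmName.startsWith("OpenJDK"));
    }

    public static void main(String[] args) {
        // java.vm.name is a standard system property set by the running JVM.
        String vmName = System.getProperty("java.vm.name");
        System.out.println(vmName + " -> HotSpot-like: " + looksLikeHotSpot(vmName));
    }
}
```

Running this on the IBM SDK and on OpenJDK would show which VM name each reports, which may help confirm whether the failures come from a check of this shape.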
   
   In hoodie-spark
   
   DataSourceTest
 - testMergeOnReadStorage
 - testDropInsertDup
 - testCopyOnWriteStorage
 
 
   These fail with errors like the one below:
   ```
   java.lang.RuntimeException: org.apache.hudi.exception.HoodieException: org.apache.hudi.exception.HoodieException: java.util.concurrent.ExecutionException: java.lang.AssertionError
       at org.apache.hudi.func.LazyIterableIterator.next(LazyIterableIterator.java:123)
       at scala.collection.convert.Wrappers$JIteratorWrapper.next(Wrappers.scala:43)
       at scala.collection.Iterator$$anon$12.nextCur(Iterator.scala:434)
       at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:440)
       at org.apache.spark.storage.memory.MemoryStore.putIteratorAsBytes(MemoryStore.scala:363)
       at org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:973)

[jira] [Updated] (HUDI-225) Create Hudi Timeline Server Fat Jar

2019-08-28 Thread BALAJI VARADARAJAN (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BALAJI VARADARAJAN updated HUDI-225:

Summary: Create Hudi Timeline Server Fat Jar   (was: Create Hudi Timeline 
Server Fat bundle )

> Create Hudi Timeline Server Fat Jar 
> 
>
> Key: HUDI-225
> URL: https://issues.apache.org/jira/browse/HUDI-225
> Project: Apache Hudi (incubating)
>  Issue Type: Task
>  Components: Deployment, newbie
>Reporter: BALAJI VARADARAJAN
>Priority: Minor
>
> We need to add maven module under packaging named hudi-timeline-server-bundle 
> to bundle timeline service 
>  In the pom, add the following shading configurations
> 
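> <plugin>
>   <groupId>org.apache.maven.plugins</groupId>
>   <artifactId>maven-shade-plugin</artifactId>
>   <version>2.4</version>
>   <configuration>
>     <createDependencyReducedPom>true</createDependencyReducedPom>
>     <filters>
>       <filter>
>         <artifact>*:*</artifact>
>         <excludes>
>           <exclude>META-INF/*.SF</exclude>
>           <exclude>META-INF/*.DSA</exclude>
>           <exclude>META-INF/*.RSA</exclude>
>         </excludes>
>       </filter>
>     </filters>
>   </configuration>
>   <executions>
>     <execution>
>       <phase>package</phase>
>       <goals>
>         <goal>shade</goal>
>       </goals>
>       <configuration>
>         <transformers>
>           <transformer implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer"/>
>           <transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
>             <mainClass>org.apache.hudi.timeline.service.TimelineService</mainClass>
>           </transformer>
>         </transformers>
>       </configuration>
>     </execution>
>   </executions>
> </plugin>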





[GitHub] [incubator-hudi] bvaradar commented on issue #856: [HUDI-222] Rename main class path to org.apache.hudi.timeline.service.TimelineService in run_server.sh

2019-08-28 Thread GitBox
bvaradar commented on issue #856: [HUDI-222] Rename main class path to 
org.apache.hudi.timeline.service.TimelineService in run_server.sh
URL: https://github.com/apache/incubator-hudi/pull/856#issuecomment-525919843
 
 
   @vinothchandar : Added https://jira.apache.org/jira/browse/HUDI-225 , a 
newbie task for this. Will go ahead and merge this change. Thanks @leesf for 
doing this.




[incubator-hudi] tag hoodie-site-0.4.7 created (now d1d74fe)

2019-08-28 Thread vbalaji

vbalaji pushed a change to tag hoodie-site-0.4.7
in repository https://gitbox.apache.org/repos/asf/incubator-hudi.git.


  at d1d74fe  (commit)
No new revisions were added by this update.



[GitHub] [incubator-hudi] vinothchandar closed pull request #790: Update docker_demo based on new bundling

2019-08-28 Thread GitBox
vinothchandar closed pull request #790:  Update docker_demo based on new 
bundling
URL: https://github.com/apache/incubator-hudi/pull/790
 
 
   




[GitHub] [incubator-hudi] vinothchandar commented on issue #859: Hudi upsert after a delete in partition will cause valid records inserted to disappear.

2019-08-28 Thread GitBox
vinothchandar commented on issue #859: Hudi upsert after a delete in partition 
will cause valid records inserted to disappear.
URL: https://github.com/apache/incubator-hudi/issues/859#issuecomment-526015827
 
 
   I will try to reproduce this and report back. 




[jira] [Updated] (HUDI-223) Allow for simply using the Spark Row schema in DeltaSync

2019-08-28 Thread Vinoth Chandar (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinoth Chandar updated HUDI-223:

Status: Closed  (was: Patch Available)

> Allow for simply using the Spark Row schema in DeltaSync 
> -
>
> Key: HUDI-223
> URL: https://issues.apache.org/jira/browse/HUDI-223
> Project: Apache Hudi (incubating)
>  Issue Type: Improvement
>  Components: deltastreamer
>Reporter: Vinoth Chandar
>Assignee: Vinoth Chandar
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>



