yanghua commented on a change in pull request #3120:
URL: https://github.com/apache/hudi/pull/3120#discussion_r665350394
##########
File path:
hudi-spark-datasource/hudi-spark/src/main/scala/org/apache/hudi/HoodieSparkSqlWriter.scala
##########
@@ -36,24 +35,21 @@ import org.apache.hudi.common.util.{CommitUtils,
ReflectionUtils}
import org.apache.hudi.config.HoodieBootstrapConfig.{BOOTSTRAP_BASE_PATH_PROP,
BOOTSTRAP_INDEX_CLASS_PROP}
import org.apache.hudi.config.HoodieWriteConfig
import org.apache.hudi.exception.HoodieException
-import org.apache.hudi.hive.util.ConfigUtils
import org.apache.hudi.hive.{HiveSyncConfig, HiveSyncTool}
import org.apache.hudi.internal.DataSourceInternalWriterHelper
-import org.apache.hudi.keygen.factory.HoodieSparkKeyGeneratorFactory
import org.apache.hudi.sync.common.AbstractSyncTool
import org.apache.log4j.LogManager
import org.apache.spark.SPARK_VERSION
import org.apache.spark.SparkContext
import org.apache.spark.api.java.JavaSparkContext
import org.apache.spark.rdd.RDD
-import org.apache.spark.sql.hudi.HoodieSqlUtils
-import org.apache.spark.sql.internal.SQLConf
-import
org.apache.spark.sql.internal.StaticSQLConf.SCHEMA_STRING_LENGTH_THRESHOLD
import org.apache.spark.sql.types.StructType
import org.apache.spark.sql.{DataFrame, SQLContext, SaveMode, SparkSession}
import scala.collection.JavaConversions._
import scala.collection.mutable.ListBuffer
+import org.apache.hudi.keygen.factory.HoodieSparkKeyGeneratorFactory
+import org.apache.spark.sql.internal.{SQLConf, StaticSQLConf}
Review comment:
This import is in the wrong position — please keep it grouped with the other imports in the correct order.
##########
File path:
hudi-spark-datasource/hudi-spark/src/main/scala/org/apache/hudi/DefaultSource.scala
##########
@@ -26,6 +26,7 @@ import
org.apache.hudi.common.model.HoodieTableType.{COPY_ON_WRITE, MERGE_ON_REA
import org.apache.hudi.common.table.{HoodieTableMetaClient,
TableSchemaResolver}
import org.apache.hudi.exception.HoodieException
import org.apache.hudi.hadoop.HoodieROTablePathFilter
+import org.apache.hudi.hive.util.ConfigUtils
Review comment:
Please separate this import group with an empty line.
##########
File path: packaging/hudi-flink-bundle/pom.xml
##########
@@ -141,6 +141,13 @@
<include>org.apache.hbase:hbase-common</include>
<include>commons-codec:commons-codec</include>
+
<include>org.apache.spark:spark-sql_${scala.binary.version}</include>
Review comment:
@danny0405 Any thoughts on how we could optimize this? IMO, it does not seem
very graceful.
##########
File path:
hudi-sync/hudi-hive-sync/src/main/java/org/apache/hudi/hive/HiveSyncConfig.java
##########
@@ -160,6 +168,8 @@ public String toString() {
+ ", supportTimestamp=" + supportTimestamp
+ ", decodePartition=" + decodePartition
+ ", createManagedTable=" + createManagedTable
+ + ", saveAsSparkDataSourceTable=" + syncAsSparkDataSourceTable
Review comment:
`save` -> `sync`?
##########
File path:
hudi-sync/hudi-hive-sync/src/test/java/org/apache/hudi/hive/TestHiveSyncTool.java
##########
@@ -157,17 +160,15 @@ public void testBasicSync(boolean useJdbc, boolean
useSchemaFromCommitMetadata)
}
@ParameterizedTest
- @MethodSource({"useJdbcAndSchemaFromCommitMetadata"})
+ @MethodSource({"useJdbcAndSchemaFromCommitMetadataAndSaveAsDataSource"})
public void testSyncCOWTableWithProperties(boolean useJdbc,
- boolean
useSchemaFromCommitMetadata) throws Exception {
+ boolean
useSchemaFromCommitMetadata,
+ boolean saveAsDataSourceTable)
throws Exception {
Review comment:
`save` -> `sync`?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]