[jira] [Updated] (SPARK-8887) Explicitly define which data types can be used as dynamic partition columns

2015-08-14 Thread Reynold Xin (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-8887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Reynold Xin updated SPARK-8887:
---
Affects Version/s: 1.5.0

 Explicitly define which data types can be used as dynamic partition columns
 ---

 Key: SPARK-8887
 URL: https://issues.apache.org/jira/browse/SPARK-8887
 Project: Spark
  Issue Type: Sub-task
  Components: SQL
Affects Versions: 1.4.0, 1.5.0
Reporter: Cheng Lian
Assignee: Yijie Shen
 Fix For: 1.6.0


 {{InsertIntoHadoopFsRelation}} implements Hive-compatible dynamic 
 partition insertion, which uses {{String.valueOf}} to encode 
 partition column values into dynamic partition directory names. This 
 implicitly limits the data types that can be used for partition columns. For 
 example, the string representation of {{StructType}} values is not well 
 defined. However, this limitation is not explicitly enforced.
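
 The problem can be illustrated with a minimal sketch. It is not Spark code: 
 {{partition_dir}} is a hypothetical helper, Python's {{str()}} stands in for 
 {{String.valueOf}}, a tuple stands in for a struct value, and the 
 {{column=value}} directory naming is the usual Hive-style layout.

```python
def partition_dir(column, value):
    """Build a Hive-style partition directory name, "column=value".

    Hypothetical helper: the default string conversion plays the role of
    the String.valueOf call named in the description.
    """
    return f"{column}={value}"


# Atomic values give names that can be parsed back unambiguously.
print(partition_dir("year", 2015))       # year=2015
print(partition_dir("country", "US"))    # country=US

# A struct-like value (modelled here as a tuple) gives a name whose format
# is an accident of the language's default string conversion.
print(partition_dir("point", (1, "x")))  # point=(1, 'x')
```

 The last directory name contains spaces, quotes and parentheses, and nothing 
 specifies how a reader should turn it back into a typed value.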
 There are several things we can improve:
 # Enforce data type requirements for dynamic partition columns by adding 
 analysis rules that throw {{AnalysisException}} when a violation occurs.
 # Abstract away the string representation of the various data types, so that 
 we don't need to convert internal representation types (e.g. {{UTF8String}}) 
 to external types (e.g. {{String}}). A set of Hive-compatible implementations 
 should be provided to ensure compatibility with Hive.
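
 The two improvements above could look roughly like the following sketch. It 
 is plain Python rather than Spark code, and every name in it 
 ({{AnalysisError}}, {{ENCODERS}}, {{check_partition_columns}}, 
 {{partition_path}}) is hypothetical. The set of supported types and their 
 string formats are illustrative assumptions, not Hive's actual rules.

```python
from datetime import date


class AnalysisError(Exception):
    """Hypothetical stand-in for Spark's AnalysisException."""


# One encoder per supported partition column type (assumed set and formats).
# A type that is missing from this table cannot be a partition column.
ENCODERS = {
    bool: lambda v: "true" if v else "false",
    int: str,
    str: lambda v: v,
    date: lambda v: v.isoformat(),
}


def check_partition_columns(schema, partition_columns):
    """Reject unsupported partition column types before any data is written."""
    for name in partition_columns:
        if name not in schema:
            raise AnalysisError(f"Partition column {name!r} is not in the schema")
        if schema[name] not in ENCODERS:
            raise AnalysisError(
                f"Column {name!r} of type {schema[name].__name__} "
                "cannot be used as a partition column"
            )


def partition_path(schema, partition_columns, row):
    """Encode a row's partition values as "col1=v1/col2=v2"."""
    check_partition_columns(schema, partition_columns)
    return "/".join(
        f"{name}={ENCODERS[schema[name]](row[name])}" for name in partition_columns
    )


schema = {"year": int, "day": date, "point": tuple}
row = {"year": 2015, "day": date(2015, 8, 14), "point": (1, 2)}

print(partition_path(schema, ["year", "day"], row))  # year=2015/day=2015-08-14

try:
    partition_path(schema, ["point"], row)
except AnalysisError as err:
    print(f"rejected: {err}")
```

 Looking up the encoder by the declared column type, rather than calling a 
 generic string conversion on each value, keeps the check and the encoding 
 tied to the same table of supported types.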



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-8887) Explicitly define which data types can be used as dynamic partition columns

2015-08-13 Thread Reynold Xin (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-8887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Reynold Xin updated SPARK-8887:
---
Parent Issue: SPARK-9932  (was: SPARK-5180)

 Explicitly define which data types can be used as dynamic partition columns
 ---

 Key: SPARK-8887
 URL: https://issues.apache.org/jira/browse/SPARK-8887
 Project: Spark
  Issue Type: Sub-task
  Components: SQL
Affects Versions: 1.4.0
Reporter: Cheng Lian
Assignee: Cheng Lian







[jira] [Updated] (SPARK-8887) Explicitly define which data types can be used as dynamic partition columns

2015-08-03 Thread Cheng Lian (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-8887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cheng Lian updated SPARK-8887:
--
Target Version/s: 1.5.0  (was: 1.6.0)

 Explicitly define which data types can be used as dynamic partition columns
 ---

 Key: SPARK-8887
 URL: https://issues.apache.org/jira/browse/SPARK-8887
 Project: Spark
  Issue Type: Sub-task
  Components: SQL
Affects Versions: 1.4.0
Reporter: Cheng Lian







[jira] [Updated] (SPARK-8887) Explicitly define which data types can be used as dynamic partition columns

2015-08-03 Thread Michael Armbrust (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-8887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Armbrust updated SPARK-8887:

Target Version/s: 1.6.0  (was: 1.5.0)

 Explicitly define which data types can be used as dynamic partition columns
 ---

 Key: SPARK-8887
 URL: https://issues.apache.org/jira/browse/SPARK-8887
 Project: Spark
  Issue Type: Sub-task
  Components: SQL
Affects Versions: 1.4.0
Reporter: Cheng Lian







[jira] [Updated] (SPARK-8887) Explicitly define which data types can be used as dynamic partition columns

2015-07-08 Thread Reynold Xin (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-8887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Reynold Xin updated SPARK-8887:
---
Issue Type: Sub-task  (was: Improvement)
Parent: SPARK-5180

 Explicitly define which data types can be used as dynamic partition columns
 ---

 Key: SPARK-8887
 URL: https://issues.apache.org/jira/browse/SPARK-8887
 Project: Spark
  Issue Type: Sub-task
  Components: SQL
Affects Versions: 1.4.0
Reporter: Cheng Lian



