[jira] [Created] (SPARK-21259) More rules for scalastyle

2017-06-29 Thread Gengliang Wang (JIRA)
Gengliang Wang created SPARK-21259: -- Summary: More rules for scalastyle Key: SPARK-21259 URL: https://issues.apache.org/jira/browse/SPARK-21259 Project: Spark Issue Type: Improvement

[jira] [Created] (SPARK-21323) Rename sql.catalyst.plans.logical.statsEstimation.Range to ValueInterval

2017-07-05 Thread Gengliang Wang (JIRA)
Gengliang Wang created SPARK-21323: -- Summary: Rename sql.catalyst.plans.logical.statsEstimation.Range to ValueInterval Key: SPARK-21323 URL: https://issues.apache.org/jira/browse/SPARK-21323

[jira] [Created] (SPARK-21336) Revise rand comparison in BatchEvalPythonExecSuite

2017-07-06 Thread Gengliang Wang (JIRA)
Gengliang Wang created SPARK-21336: -- Summary: Revise rand comparison in BatchEvalPythonExecSuite Key: SPARK-21336 URL: https://issues.apache.org/jira/browse/SPARK-21336 Project: Spark Issue

[jira] [Created] (SPARK-21174) Validate sampling fraction in logical operator level

2017-06-22 Thread Gengliang Wang (JIRA)
Gengliang Wang created SPARK-21174: -- Summary: Validate sampling fraction in logical operator level Key: SPARK-21174 URL: https://issues.apache.org/jira/browse/SPARK-21174 Project: Spark

[jira] [Created] (SPARK-21196) Split codegen info of query plan into sequence

2017-06-23 Thread Gengliang Wang (JIRA)
Gengliang Wang created SPARK-21196: -- Summary: Split codegen info of query plan into sequence Key: SPARK-21196 URL: https://issues.apache.org/jira/browse/SPARK-21196 Project: Spark Issue

[jira] [Created] (SPARK-21222) Move elimination of Distinct clause from analyzer to optimizer

2017-06-26 Thread Gengliang Wang (JIRA)
Gengliang Wang created SPARK-21222: -- Summary: Move elimination of Distinct clause from analyzer to optimizer Key: SPARK-21222 URL: https://issues.apache.org/jira/browse/SPARK-21222 Project: Spark

[jira] [Commented] (SPARK-21222) Move elimination of Distinct clause from analyzer to optimizer

2017-06-27 Thread Gengliang Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16065791#comment-16065791 ] Gengliang Wang commented on SPARK-21222: [~srowen] thanks! I have corrected the statement. >

[jira] [Updated] (SPARK-21222) Move elimination of Distinct clause from analyzer to optimizer

2017-06-27 Thread Gengliang Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gengliang Wang updated SPARK-21222: --- Description: Move elimination of Distinct clause from analyzer to optimizer Distinct clause

[jira] [Created] (SPARK-22037) Collapse Project if it is the child of Aggregate

2017-09-16 Thread Gengliang Wang (JIRA)
Gengliang Wang created SPARK-22037: -- Summary: Collapse Project if it is the child of Aggregate Key: SPARK-22037 URL: https://issues.apache.org/jira/browse/SPARK-22037 Project: Spark Issue

[jira] [Created] (SPARK-22263) Refactor deterministic as lazy value

2017-10-12 Thread Gengliang Wang (JIRA)
Gengliang Wang created SPARK-22263: -- Summary: Refactor deterministic as lazy value Key: SPARK-22263 URL: https://issues.apache.org/jira/browse/SPARK-22263 Project: Spark Issue Type:

[jira] [Created] (SPARK-21979) Improve QueryPlanConstraints framework

2017-09-12 Thread Gengliang Wang (JIRA)
Gengliang Wang created SPARK-21979: -- Summary: Improve QueryPlanConstraints framework Key: SPARK-21979 URL: https://issues.apache.org/jira/browse/SPARK-21979 Project: Spark Issue Type: Bug

[jira] [Created] (SPARK-21848) Create trait to identify user-defined functions

2017-08-27 Thread Gengliang Wang (JIRA)
Gengliang Wang created SPARK-21848: -- Summary: Create trait to identify user-defined functions Key: SPARK-21848 URL: https://issues.apache.org/jira/browse/SPARK-21848 Project: Spark Issue

[jira] [Created] (SPARK-22257) Reserve all non-deterministic expressions in ExpressionSet.

2017-10-11 Thread Gengliang Wang (JIRA)
Gengliang Wang created SPARK-22257: -- Summary: Reserve all non-deterministic expressions in ExpressionSet. Key: SPARK-22257 URL: https://issues.apache.org/jira/browse/SPARK-22257 Project: Spark

[jira] [Created] (SPARK-22141) Propagate empty relation before checking Cartesian products

2017-09-27 Thread Gengliang Wang (JIRA)
Gengliang Wang created SPARK-22141: -- Summary: Propagate empty relation before checking Cartesian products Key: SPARK-22141 URL: https://issues.apache.org/jira/browse/SPARK-22141 Project: Spark

[jira] [Updated] (SPARK-22141) Propagate empty relation before checking Cartesian products

2017-09-27 Thread Gengliang Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gengliang Wang updated SPARK-22141: --- Description: When inferring constraints from children, Join's condition can be simplified as

[jira] [Updated] (SPARK-22615) Handle more cases in PropagateEmptyRelation

2017-11-27 Thread Gengliang Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gengliang Wang updated SPARK-22615: --- Description: Currently, in the optimize rule `PropagateEmptyRelation`, the following cases

[jira] [Created] (SPARK-22615) Handle more cases in PropagateEmptyRelation

2017-11-27 Thread Gengliang Wang (JIRA)
Gengliang Wang created SPARK-22615: -- Summary: Handle more cases in PropagateEmptyRelation Key: SPARK-22615 URL: https://issues.apache.org/jira/browse/SPARK-22615 Project: Spark Issue Type:

[jira] [Created] (SPARK-22763) SHS: Ignore unknown events and parse through the file

2017-12-12 Thread Gengliang Wang (JIRA)
Gengliang Wang created SPARK-22763: -- Summary: SHS: Ignore unknown events and parse through the file Key: SPARK-22763 URL: https://issues.apache.org/jira/browse/SPARK-22763 Project: Spark

[jira] [Created] (SPARK-22834) Make insert commands have real children to fix UI issues

2017-12-19 Thread Gengliang Wang (JIRA)
Gengliang Wang created SPARK-22834: -- Summary: Make insert commands have real children to fix UI issues Key: SPARK-22834 URL: https://issues.apache.org/jira/browse/SPARK-22834 Project: Spark

[jira] [Created] (SPARK-22559) history server: handle exception on opening corrupted listing.ldb

2017-11-19 Thread Gengliang Wang (JIRA)
Gengliang Wang created SPARK-22559: -- Summary: history server: handle exception on opening corrupted listing.ldb Key: SPARK-22559 URL: https://issues.apache.org/jira/browse/SPARK-22559 Project: Spark

[jira] [Created] (SPARK-22719) refactor ConstantPropagation

2017-12-06 Thread Gengliang Wang (JIRA)
Gengliang Wang created SPARK-22719: -- Summary: refactor ConstantPropagation Key: SPARK-22719 URL: https://issues.apache.org/jira/browse/SPARK-22719 Project: Spark Issue Type: Improvement

[jira] [Updated] (SPARK-22719) refactor ConstantPropagation

2017-12-06 Thread Gengliang Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gengliang Wang updated SPARK-22719: --- Description: The current time complexity of ConstantPropagation is O(n^2), which can be slow

[jira] [Created] (SPARK-24275) Revise doc comments in InputPartition

2018-05-14 Thread Gengliang Wang (JIRA)
Gengliang Wang created SPARK-24275: -- Summary: Revise doc comments in InputPartition Key: SPARK-24275 URL: https://issues.apache.org/jira/browse/SPARK-24275 Project: Spark Issue Type:

[jira] [Created] (SPARK-24277) Code clean up in SQL module: HadoopMapReduceCommitProtocol/FileFormatWriter

2018-05-15 Thread Gengliang Wang (JIRA)
Gengliang Wang created SPARK-24277: -- Summary: Code clean up in SQL module: HadoopMapReduceCommitProtocol/FileFormatWriter Key: SPARK-24277 URL: https://issues.apache.org/jira/browse/SPARK-24277

[jira] [Updated] (SPARK-24330) Refactor ExecuteWriteTask in FileFormatWriter with DataWriter(V2)

2018-05-21 Thread Gengliang Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gengliang Wang updated SPARK-24330: --- Description: Refactor ExecuteWriteTask in FileFormatWriter to reduce common logic and

[jira] [Created] (SPARK-24330) Refactor ExecuteWriteTask in FileFormatWriter with DataWriter(V2)

2018-05-21 Thread Gengliang Wang (JIRA)
Gengliang Wang created SPARK-24330: -- Summary: Refactor ExecuteWriteTask in FileFormatWriter with DataWriter(V2) Key: SPARK-24330 URL: https://issues.apache.org/jira/browse/SPARK-24330 Project: Spark

[jira] [Created] (SPARK-24365) Add Parquet write benchmark

2018-05-23 Thread Gengliang Wang (JIRA)
Gengliang Wang created SPARK-24365: -- Summary: Add Parquet write benchmark Key: SPARK-24365 URL: https://issues.apache.org/jira/browse/SPARK-24365 Project: Spark Issue Type: Improvement

[jira] [Created] (SPARK-24367) Parquet: use JOB_SUMMARY_LEVEL instead of deprecated flag ENABLE_JOB_SUMMARY

2018-05-23 Thread Gengliang Wang (JIRA)
Gengliang Wang created SPARK-24367: -- Summary: Parquet: use JOB_SUMMARY_LEVEL instead of deprecated flag ENABLE_JOB_SUMMARY Key: SPARK-24367 URL: https://issues.apache.org/jira/browse/SPARK-24367

[jira] [Created] (SPARK-24524) Improve aggregateMetrics: less memory usage and loops

2018-06-11 Thread Gengliang Wang (JIRA)
Gengliang Wang created SPARK-24524: -- Summary: Improve aggregateMetrics: less memory usage and loops Key: SPARK-24524 URL: https://issues.apache.org/jira/browse/SPARK-24524 Project: Spark

[jira] [Updated] (SPARK-24365) Add data source write benchmark

2018-05-28 Thread Gengliang Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gengliang Wang updated SPARK-24365: --- Description: Add data source write benchmark. So that it would be easier to measure the

[jira] [Updated] (SPARK-24365) Add data source write benchmark

2018-05-28 Thread Gengliang Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gengliang Wang updated SPARK-24365: --- Summary: Add data source write benchmark (was: Add Parquet write benchmark) > Add data

[jira] [Created] (SPARK-23005) Improve RDD.take on small number of partitions

2018-01-09 Thread Gengliang Wang (JIRA)
Gengliang Wang created SPARK-23005: -- Summary: Improve RDD.take on small number of partitions Key: SPARK-23005 URL: https://issues.apache.org/jira/browse/SPARK-23005 Project: Spark Issue

[jira] [Created] (SPARK-22990) Fix method isFairScheduler in JobsTab and StagesTab

2018-01-08 Thread Gengliang Wang (JIRA)
Gengliang Wang created SPARK-22990: -- Summary: Fix method isFairScheduler in JobsTab and StagesTab Key: SPARK-22990 URL: https://issues.apache.org/jira/browse/SPARK-22990 Project: Spark

[jira] [Created] (SPARK-23079) Fix query constraints propagation with aliases

2018-01-15 Thread Gengliang Wang (JIRA)
Gengliang Wang created SPARK-23079: -- Summary: Fix query constraints propagation with aliases Key: SPARK-23079 URL: https://issues.apache.org/jira/browse/SPARK-23079 Project: Spark Issue

[jira] [Created] (SPARK-23219) Rename ReadTask to DataReaderFactory

2018-01-25 Thread Gengliang Wang (JIRA)
Gengliang Wang created SPARK-23219: -- Summary: Rename ReadTask to DataReaderFactory Key: SPARK-23219 URL: https://issues.apache.org/jira/browse/SPARK-23219 Project: Spark Issue Type:

[jira] [Created] (SPARK-23202) Break down DataSourceV2Writer.commit into two phase

2018-01-24 Thread Gengliang Wang (JIRA)
Gengliang Wang created SPARK-23202: -- Summary: Break down DataSourceV2Writer.commit into two phase Key: SPARK-23202 URL: https://issues.apache.org/jira/browse/SPARK-23202 Project: Spark

[jira] [Created] (SPARK-23268) Reorganize packages in data source V2

2018-01-30 Thread Gengliang Wang (JIRA)
Gengliang Wang created SPARK-23268: -- Summary: Reorganize packages in data source V2 Key: SPARK-23268 URL: https://issues.apache.org/jira/browse/SPARK-23268 Project: Spark Issue Type:

[jira] [Updated] (SPARK-23202) Add new API in DataSourceWriter: onDataWriterCommit

2018-01-31 Thread Gengliang Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gengliang Wang updated SPARK-23202: --- Description: The current DataSourceWriter API makes it hard to implement 

[jira] [Updated] (SPARK-23202) Add new API in DataSourceWriter: onDataWriterCommit

2018-01-31 Thread Gengliang Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gengliang Wang updated SPARK-23202: --- Summary: Add new API in DataSourceWriter: onDataWriterCommit (was: Break down

[jira] [Updated] (SPARK-23202) Break down DataSourceV2Writer.commit into two phase

2018-01-30 Thread Gengliang Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gengliang Wang updated SPARK-23202: --- Affects Version/s: (was: 2.2.1) 2.3.0 > Break down

[jira] [Created] (SPARK-23490) Check storage.locationUri with existing table in CreateTable

2018-02-22 Thread Gengliang Wang (JIRA)
Gengliang Wang created SPARK-23490: -- Summary: Check storage.locationUri with existing table in CreateTable Key: SPARK-23490 URL: https://issues.apache.org/jira/browse/SPARK-23490 Project: Spark

[jira] [Created] (SPARK-23507) Migrate file-based data sources to data source v2

2018-02-24 Thread Gengliang Wang (JIRA)
Gengliang Wang created SPARK-23507: -- Summary: Migrate file-based data sources to data source v2 Key: SPARK-23507 URL: https://issues.apache.org/jira/browse/SPARK-23507 Project: Spark Issue

[jira] [Created] (SPARK-25002) Avro: revise the output namespace

2018-08-02 Thread Gengliang Wang (JIRA)
Gengliang Wang created SPARK-25002: -- Summary: Avro: revise the output namespace Key: SPARK-25002 URL: https://issues.apache.org/jira/browse/SPARK-25002 Project: Spark Issue Type: Sub-task

[jira] [Updated] (SPARK-25002) Avro: revise the output record namespace

2018-08-02 Thread Gengliang Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gengliang Wang updated SPARK-25002: --- Summary: Avro: revise the output record namespace (was: Avro: revise the output namespace)

[jira] [Created] (SPARK-25104) Validate user specified output schema

2018-08-13 Thread Gengliang Wang (JIRA)
Gengliang Wang created SPARK-25104: -- Summary: Validate user specified output schema Key: SPARK-25104 URL: https://issues.apache.org/jira/browse/SPARK-25104 Project: Spark Issue Type:

[jira] [Created] (SPARK-25129) Revert mapping of com.databricks.spark.avro

2018-08-16 Thread Gengliang Wang (JIRA)
Gengliang Wang created SPARK-25129: -- Summary: Revert mapping of com.databricks.spark.avro Key: SPARK-25129 URL: https://issues.apache.org/jira/browse/SPARK-25129 Project: Spark Issue Type:

[jira] [Commented] (SPARK-24924) Add mapping for built-in Avro data source

2018-08-15 Thread Gengliang Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16581929#comment-16581929 ] Gengliang Wang commented on SPARK-24924: As package "org.apache.spark.sql.avro" is external

[jira] [Commented] (SPARK-23817) Migrate ORC file format read path to data source V2

2018-08-16 Thread Gengliang Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16582005#comment-16582005 ] Gengliang Wang commented on SPARK-23817: [~dongjoon] Thanks! This issue is still open. >

[jira] [Created] (SPARK-25133) Documentation: AVRO data source guide

2018-08-16 Thread Gengliang Wang (JIRA)
Gengliang Wang created SPARK-25133: -- Summary: Documentation: AVRO data source guide Key: SPARK-25133 URL: https://issues.apache.org/jira/browse/SPARK-25133 Project: Spark Issue Type:

[jira] [Commented] (SPARK-24924) Add mapping for built-in Avro data source

2018-08-17 Thread Gengliang Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16583483#comment-16583483 ] Gengliang Wang commented on SPARK-24924: [~dongjoon] I see. I am now +1 with adding new 

[jira] [Updated] (SPARK-25129) Make the mapping of com.databricks.spark.avro to built-in module configurable

2018-08-17 Thread Gengliang Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gengliang Wang updated SPARK-25129: --- Description: In https://issues.apache.org/jira/browse/SPARK-24924, the data source provider 

[jira] [Created] (SPARK-25099) Generate Avro Binary files in test suite

2018-08-13 Thread Gengliang Wang (JIRA)
Gengliang Wang created SPARK-25099: -- Summary: Generate Avro Binary files in test suite Key: SPARK-25099 URL: https://issues.apache.org/jira/browse/SPARK-25099 Project: Spark Issue Type:

[jira] [Updated] (SPARK-25099) Generate Avro Binary files in test suite

2018-08-13 Thread Gengliang Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gengliang Wang updated SPARK-25099: --- Description: In PR [https://github.com/apache/spark/pull/21984] and

[jira] [Updated] (SPARK-24774) support reading AVRO logical types - Decimal

2018-08-08 Thread Gengliang Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gengliang Wang updated SPARK-24774: --- Summary: support reading AVRO logical types - Decimal (was: support reading AVRO logical

[jira] [Updated] (SPARK-24772) support reading AVRO logical types - Date

2018-08-08 Thread Gengliang Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24772?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gengliang Wang updated SPARK-24772: --- Summary: support reading AVRO logical types - Date (was: support reading AVRO logical

[jira] [Created] (SPARK-25160) Remove sql configuration spark.sql.avro.outputTimestampType

2018-08-20 Thread Gengliang Wang (JIRA)
Gengliang Wang created SPARK-25160: -- Summary: Remove sql configuration spark.sql.avro.outputTimestampType Key: SPARK-25160 URL: https://issues.apache.org/jira/browse/SPARK-25160 Project: Spark

[jira] [Resolved] (SPARK-25129) Make the mapping of com.databricks.spark.avro to built-in module configurable

2018-08-23 Thread Gengliang Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gengliang Wang resolved SPARK-25129. Resolution: Fixed > Make the mapping of com.databricks.spark.avro to built-in module

[jira] [Created] (SPARK-24876) Remove SerializableSchema and use json format string schema

2018-07-20 Thread Gengliang Wang (JIRA)
Gengliang Wang created SPARK-24876: -- Summary: Remove SerializableSchema and use json format string schema Key: SPARK-24876 URL: https://issues.apache.org/jira/browse/SPARK-24876 Project: Spark

[jira] [Updated] (SPARK-24876) Simplify schema serialization

2018-07-20 Thread Gengliang Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gengliang Wang updated SPARK-24876: --- Summary: Simplify schema serialization (was: Remove SerializableSchema and use json format

[jira] [Resolved] (SPARK-24770) Supporting to convert a column into binary of AVRO format

2018-07-15 Thread Gengliang Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24770?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gengliang Wang resolved SPARK-24770. Resolution: Duplicate The function `from_avro` and `to_avro` can be added in one PR: #

[jira] [Issue Comment Deleted] (SPARK-24770) Supporting to convert a column into binary of AVRO format

2018-07-15 Thread Gengliang Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24770?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gengliang Wang updated SPARK-24770: --- Comment: was deleted (was: The function `from_avro` and `to_avro` can be added in one PR:

[jira] [Resolved] (SPARK-24769) Support for parsing AVRO binary column

2018-07-15 Thread Gengliang Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gengliang Wang resolved SPARK-24769. Resolution: Duplicate > Support for parsing AVRO binary column >

[jira] [Commented] (SPARK-24769) Support for parsing AVRO binary column

2018-07-15 Thread Gengliang Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16544636#comment-16544636 ] Gengliang Wang commented on SPARK-24769: The function `from_avro` and `to_avro` can be added in

[jira] [Commented] (SPARK-24770) Supporting to convert a column into binary of AVRO format

2018-07-15 Thread Gengliang Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16544637#comment-16544637 ] Gengliang Wang commented on SPARK-24770: The function `from_avro` and `to_avro` can be added in

[jira] [Commented] (SPARK-24770) Supporting to convert a column into binary of AVRO format

2018-07-15 Thread Gengliang Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16544639#comment-16544639 ] Gengliang Wang commented on SPARK-24770: [~felipesmmelo] Thank you. But I have created a PR: 

[jira] [Commented] (SPARK-24769) Support for parsing AVRO binary column

2018-07-15 Thread Gengliang Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16544640#comment-16544640 ] Gengliang Wang commented on SPARK-24769: [~felipesmmelo] Thank you. But I have created a PR: 

[jira] [Created] (SPARK-24811) Add function `from_avro` and `to_avro`

2018-07-15 Thread Gengliang Wang (JIRA)
Gengliang Wang created SPARK-24811: -- Summary: Add function `from_avro` and `to_avro` Key: SPARK-24811 URL: https://issues.apache.org/jira/browse/SPARK-24811 Project: Spark Issue Type:

[jira] [Created] (SPARK-24883) Remove implicit class AvroDataFrameWriter/AvroDataFrameReader

2018-07-22 Thread Gengliang Wang (JIRA)
Gengliang Wang created SPARK-24883: -- Summary: Remove implicit class AvroDataFrameWriter/AvroDataFrameReader Key: SPARK-24883 URL: https://issues.apache.org/jira/browse/SPARK-24883 Project: Spark

[jira] [Created] (SPARK-24887) Use SerializableConfiguration in Spark util

2018-07-23 Thread Gengliang Wang (JIRA)
Gengliang Wang created SPARK-24887: -- Summary: Use SerializableConfiguration in Spark util Key: SPARK-24887 URL: https://issues.apache.org/jira/browse/SPARK-24887 Project: Spark Issue Type:

[jira] [Created] (SPARK-24858) Avoid unnecessary parquet footer reads

2018-07-19 Thread Gengliang Wang (JIRA)
Gengliang Wang created SPARK-24858: -- Summary: Avoid unnecessary parquet footer reads Key: SPARK-24858 URL: https://issues.apache.org/jira/browse/SPARK-24858 Project: Spark Issue Type:

[jira] [Created] (SPARK-24919) Scala linter rule for sparkContext.hadoopConfiguration

2018-07-25 Thread Gengliang Wang (JIRA)
Gengliang Wang created SPARK-24919: -- Summary: Scala linter rule for sparkContext.hadoopConfiguration Key: SPARK-24919 URL: https://issues.apache.org/jira/browse/SPARK-24919 Project: Spark

[jira] [Created] (SPARK-25305) Respect attribute name in `CollapseProject`

2018-09-01 Thread Gengliang Wang (JIRA)
Gengliang Wang created SPARK-25305: -- Summary: Respect attribute name in `CollapseProject` Key: SPARK-25305 URL: https://issues.apache.org/jira/browse/SPARK-25305 Project: Spark Issue Type:

[jira] [Updated] (SPARK-25305) Respect attribute name in `CollapseProject` and `ColumnPruning`

2018-09-01 Thread Gengliang Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gengliang Wang updated SPARK-25305: --- Summary: Respect attribute name in `CollapseProject` and `ColumnPruning` (was: Respect

[jira] [Updated] (SPARK-25305) Respect attribute name in `CollapseProject`

2018-09-01 Thread Gengliang Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gengliang Wang updated SPARK-25305: --- Description: Currently in optimizer rule `CollapseProject`, the lower level project is

[jira] [Commented] (SPARK-24771) Upgrade AVRO version from 1.7.7 to 1.8

2018-09-06 Thread Gengliang Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16605308#comment-16605308 ] Gengliang Wang commented on SPARK-24771: [~vanzin] I am OK with either way. Shading Avro 1.8 in

[jira] [Created] (SPARK-24768) Have a built-in AVRO data source implementation

2018-07-10 Thread Gengliang Wang (JIRA)
Gengliang Wang created SPARK-24768: -- Summary: Have a built-in AVRO data source implementation Key: SPARK-24768 URL: https://issues.apache.org/jira/browse/SPARK-24768 Project: Spark Issue

[jira] [Created] (SPARK-24771) Upgrade AVRO version from 1.7.7 to 1.8

2018-07-10 Thread Gengliang Wang (JIRA)
Gengliang Wang created SPARK-24771: -- Summary: Upgrade AVRO version from 1.7.7 to 1.8 Key: SPARK-24771 URL: https://issues.apache.org/jira/browse/SPARK-24771 Project: Spark Issue Type:

[jira] [Created] (SPARK-24769) Support for parsing AVRO string column

2018-07-10 Thread Gengliang Wang (JIRA)
Gengliang Wang created SPARK-24769: -- Summary: Support for parsing AVRO string column Key: SPARK-24769 URL: https://issues.apache.org/jira/browse/SPARK-24769 Project: Spark Issue Type:

[jira] [Updated] (SPARK-24768) Have a built-in AVRO data source implementation

2018-07-10 Thread Gengliang Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gengliang Wang updated SPARK-24768: --- Attachment: Design doc-Spark Avro.pdf > Have a built-in AVRO data source implementation >

[jira] [Created] (SPARK-24772) support reading AVRO logical types - Decimal

2018-07-10 Thread Gengliang Wang (JIRA)
Gengliang Wang created SPARK-24772: -- Summary: support reading AVRO logical types - Decimal Key: SPARK-24772 URL: https://issues.apache.org/jira/browse/SPARK-24772 Project: Spark Issue Type:

[jira] [Updated] (SPARK-24776) AVRO unit test: use SQLTestUtils and Replace deprecated methods

2018-07-10 Thread Gengliang Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gengliang Wang updated SPARK-24776: --- Summary: AVRO unit test: use SQLTestUtils and Replace deprecated methods (was: Improve

[jira] [Updated] (SPARK-24768) Have a built-in AVRO data source implementation

2018-07-10 Thread Gengliang Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gengliang Wang updated SPARK-24768: --- Attachment: Built-in AVRO Data Source In Spark 2.4.pdf > Have a built-in AVRO data source

[jira] [Created] (SPARK-24776) Improve AVRO unit test: use

2018-07-10 Thread Gengliang Wang (JIRA)
Gengliang Wang created SPARK-24776: -- Summary: Improve AVRO unit test: use Key: SPARK-24776 URL: https://issues.apache.org/jira/browse/SPARK-24776 Project: Spark Issue Type: Sub-task

[jira] [Updated] (SPARK-24768) Have a built-in AVRO data source implementation

2018-07-10 Thread Gengliang Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gengliang Wang updated SPARK-24768: --- Description: Apache Avro (https://avro.apache.org) is a popular data serialization format.

[jira] [Updated] (SPARK-24768) Have a built-in AVRO data source implementation

2018-07-10 Thread Gengliang Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gengliang Wang updated SPARK-24768: --- Attachment: (was: Design doc-Spark Avro.pdf) > Have a built-in AVRO data source

[jira] [Updated] (SPARK-24770) Supporting to convert a column into binary of AVRO format

2018-07-10 Thread Gengliang Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24770?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gengliang Wang updated SPARK-24770: --- Summary: Supporting to convert a column into binary of AVRO format (was: Supporting to

[jira] [Updated] (SPARK-24769) Support for parsing AVRO binary column

2018-07-10 Thread Gengliang Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gengliang Wang updated SPARK-24769: --- Summary: Support for parsing AVRO binary column (was: Support for parsing AVRO string

[jira] [Created] (SPARK-24777) Refactor AVRO read/write benchmark

2018-07-10 Thread Gengliang Wang (JIRA)
Gengliang Wang created SPARK-24777: -- Summary: Refactor AVRO read/write benchmark Key: SPARK-24777 URL: https://issues.apache.org/jira/browse/SPARK-24777 Project: Spark Issue Type: Sub-task

[jira] [Created] (SPARK-24770) Supporting to convert a column into binary of avro format

2018-07-10 Thread Gengliang Wang (JIRA)
Gengliang Wang created SPARK-24770: -- Summary: Supporting to convert a column into binary of avro format Key: SPARK-24770 URL: https://issues.apache.org/jira/browse/SPARK-24770 Project: Spark

[jira] [Created] (SPARK-24775) support reading AVRO logical types - Duration

2018-07-10 Thread Gengliang Wang (JIRA)
Gengliang Wang created SPARK-24775: -- Summary: support reading AVRO logical types - Duration Key: SPARK-24775 URL: https://issues.apache.org/jira/browse/SPARK-24775 Project: Spark Issue

[jira] [Created] (SPARK-24774) support reading AVRO logical types - Time with different precisions

2018-07-10 Thread Gengliang Wang (JIRA)
Gengliang Wang created SPARK-24774: -- Summary: support reading AVRO logical types - Time with different precisions Key: SPARK-24774 URL: https://issues.apache.org/jira/browse/SPARK-24774 Project:

[jira] [Created] (SPARK-24773) support reading AVRO logical types - Timestamp with different precisions

2018-07-10 Thread Gengliang Wang (JIRA)
Gengliang Wang created SPARK-24773: -- Summary: support reading AVRO logical types - Timestamp with different precisions Key: SPARK-24773 URL: https://issues.apache.org/jira/browse/SPARK-24773

[jira] [Created] (SPARK-24792) Add API `.avro` in DataFrameReader/DataFrameWriter

2018-07-11 Thread Gengliang Wang (JIRA)
Gengliang Wang created SPARK-24792: -- Summary: Add API `.avro` in DataFrameReader/DataFrameWriter Key: SPARK-24792 URL: https://issues.apache.org/jira/browse/SPARK-24792 Project: Spark Issue

[jira] [Created] (SPARK-24800) Refactor Avro Serializer and Deserializer

2018-07-13 Thread Gengliang Wang (JIRA)
Gengliang Wang created SPARK-24800: -- Summary: Refactor Avro Serializer and Deserializer Key: SPARK-24800 URL: https://issues.apache.org/jira/browse/SPARK-24800 Project: Spark Issue Type:

[jira] [Created] (SPARK-23624) Revise doc of method pushFilters

2018-03-07 Thread Gengliang Wang (JIRA)
Gengliang Wang created SPARK-23624: -- Summary: Revise doc of method pushFilters Key: SPARK-23624 URL: https://issues.apache.org/jira/browse/SPARK-23624 Project: Spark Issue Type: Improvement

[jira] [Created] (SPARK-23896) Improve PartitioningAwareFileIndex

2018-04-08 Thread Gengliang Wang (JIRA)
Gengliang Wang created SPARK-23896: -- Summary: Improve PartitioningAwareFileIndex Key: SPARK-23896 URL: https://issues.apache.org/jira/browse/SPARK-23896 Project: Spark Issue Type:

[jira] [Created] (SPARK-24045) Create base class for file data source v2

2018-04-22 Thread Gengliang Wang (JIRA)
Gengliang Wang created SPARK-24045: -- Summary: Create base class for file data source v2 Key: SPARK-24045 URL: https://issues.apache.org/jira/browse/SPARK-24045 Project: Spark Issue Type:
