[jira] [Updated] (FLINK-9407) Support orc rolling sink writer

2021-09-28 Thread Xintong Song (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-9407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xintong Song updated FLINK-9407:

Fix Version/s: (was: 1.14.0)
   1.15.0

> Support orc rolling sink writer
> ---
>
> Key: FLINK-9407
> URL: https://issues.apache.org/jira/browse/FLINK-9407
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / FileSystem
>Reporter: zhangminglei
>Priority: Minor
>  Labels: auto-deprioritized-major, pull-request-available, 
> usability
> Fix For: 1.15.0
>
>
> Currently, we only support {{StringWriter}}, {{SequenceFileWriter}} and 
> {{AvroKeyValueSinkWriter}}. I would suggest adding an ORC writer for the 
> rolling sink.
> Below, FYI.
> I tested the PR and verified the results with Spark SQL; the data written 
> earlier can be read back as expected. I will add more tests in the next 
> couple of days, including performance under compression with short 
> checkpoint intervals, and more unit tests.
> {code:java}
> scala> spark.read.orc("hdfs://10.199.196.0:9000/data/hive/man/2018-07-06--21")
> res1: org.apache.spark.sql.DataFrame = [name: string, age: int ... 1 more 
> field]
> scala>
> scala> res1.registerTempTable("tablerice")
> warning: there was one deprecation warning; re-run with -deprecation for 
> details
> scala> spark.sql("select * from tablerice")
> res3: org.apache.spark.sql.DataFrame = [name: string, age: int ... 1 more 
> field]
> scala> res3.show(3)
> +-----+---+-------+
> | name|age|married|
> +-----+---+-------+
> |Sagar| 26|  false|
> |Sagar| 30|  false|
> |Sagar| 34|  false|
> +-----+---+-------+
> only showing top 3 rows
> {code}
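As a rough illustration of what such a writer would produce, here is a minimal, self-contained sketch (not the code from the PR) that writes rows like the ones shown above with the Apache ORC core API. The class name, output path, and schema are assumptions for the example; in the actual feature this logic would sit behind the rolling sink's writer interface.

{code:java}
import java.nio.charset.StandardCharsets;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector;
import org.apache.hadoop.hive.ql.exec.vector.LongColumnVector;
import org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch;
import org.apache.orc.OrcFile;
import org.apache.orc.TypeDescription;
import org.apache.orc.Writer;

// Illustrative sketch only: writes a few (name, age, married) rows to an ORC file.
public class OrcWriteSketch {

    public static void main(String[] args) throws Exception {
        // Schema matching the columns shown in the Spark output above.
        TypeDescription schema =
                TypeDescription.fromString("struct<name:string,age:int,married:boolean>");

        Writer writer = OrcFile.createWriter(
                new Path("/tmp/man/part-0.orc"), // hypothetical output path
                OrcFile.writerOptions(new Configuration()).setSchema(schema));

        VectorizedRowBatch batch = schema.createRowBatch();
        BytesColumnVector name = (BytesColumnVector) batch.cols[0];
        LongColumnVector age = (LongColumnVector) batch.cols[1];
        LongColumnVector married = (LongColumnVector) batch.cols[2]; // booleans stored as 0/1

        for (int a : new int[] {26, 30, 34}) {
            int row = batch.size++;
            name.setVal(row, "Sagar".getBytes(StandardCharsets.UTF_8));
            age.vector[row] = a;
            married.vector[row] = 0; // false
        }

        writer.addRowBatch(batch);
        writer.close();
    }
}
{code}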



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-9407) Support orc rolling sink writer

2021-05-26 Thread Flink Jira Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-9407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flink Jira Bot updated FLINK-9407:
--
  Labels: auto-deprioritized-major pull-request-available usability  (was: 
pull-request-available stale-major usability)
Priority: Minor  (was: Major)

This issue was labeled "stale-major" 7 days ago and has not received any 
updates, so it is being deprioritized. If this ticket is actually Major, please 
raise the priority and ask a committer to assign you the issue or revive the 
public discussion.


> Support orc rolling sink writer
> ---
>
> Key: FLINK-9407
> URL: https://issues.apache.org/jira/browse/FLINK-9407
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / FileSystem
>Reporter: zhangminglei
>Priority: Minor
>  Labels: auto-deprioritized-major, pull-request-available, 
> usability
> Fix For: 1.14.0
>
>
> Currently, we only support {{StringWriter}}, {{SequenceFileWriter}} and 
> {{AvroKeyValueSinkWriter}}. I would suggest adding an ORC writer for the 
> rolling sink.
> Below, FYI.
> I tested the PR and verified the results with Spark SQL; the data written 
> earlier can be read back as expected. I will add more tests in the next 
> couple of days, including performance under compression with short 
> checkpoint intervals, and more unit tests.
> {code:java}
> scala> spark.read.orc("hdfs://10.199.196.0:9000/data/hive/man/2018-07-06--21")
> res1: org.apache.spark.sql.DataFrame = [name: string, age: int ... 1 more 
> field]
> scala>
> scala> res1.registerTempTable("tablerice")
> warning: there was one deprecation warning; re-run with -deprecation for 
> details
> scala> spark.sql("select * from tablerice")
> res3: org.apache.spark.sql.DataFrame = [name: string, age: int ... 1 more 
> field]
> scala> res3.show(3)
> +-----+---+-------+
> | name|age|married|
> +-----+---+-------+
> |Sagar| 26|  false|
> |Sagar| 30|  false|
> |Sagar| 34|  false|
> +-----+---+-------+
> only showing top 3 rows
> {code}
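Note that the shell output above warns that {{registerTempTable}} is deprecated; an equivalent check with the non-deprecated {{createOrReplaceTempView}} API might look like the following sketch (the class name and the local master setting are assumptions, not part of the PR).

{code:java}
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

// Illustrative sketch only: reads the ORC output back and queries it with Spark SQL.
public class OrcReadCheck {

    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("orc-sink-verification")
                .master("local[*]") // assumption: run the check locally
                .getOrCreate();

        // Same bucket directory as in the spark-shell session above.
        Dataset<Row> df =
                spark.read().orc("hdfs://10.199.196.0:9000/data/hive/man/2018-07-06--21");

        // createOrReplaceTempView replaces the deprecated registerTempTable.
        df.createOrReplaceTempView("tablerice");
        spark.sql("select * from tablerice").show(3);

        spark.stop();
    }
}
{code}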



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-9407) Support orc rolling sink writer

2021-04-29 Thread Dawid Wysakowicz (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-9407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dawid Wysakowicz updated FLINK-9407:

Fix Version/s: (was: 1.13.0)
   1.14.0

> Support orc rolling sink writer
> ---
>
> Key: FLINK-9407
> URL: https://issues.apache.org/jira/browse/FLINK-9407
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / FileSystem
>Reporter: zhangminglei
>Priority: Major
>  Labels: pull-request-available, stale-major, usability
> Fix For: 1.14.0
>
>
> Currently, we only support {{StringWriter}}, {{SequenceFileWriter}} and 
> {{AvroKeyValueSinkWriter}}. I would suggest adding an ORC writer for the 
> rolling sink.
> Below, FYI.
> I tested the PR and verified the results with Spark SQL; the data written 
> earlier can be read back as expected. I will add more tests in the next 
> couple of days, including performance under compression with short 
> checkpoint intervals, and more unit tests.
> {code:java}
> scala> spark.read.orc("hdfs://10.199.196.0:9000/data/hive/man/2018-07-06--21")
> res1: org.apache.spark.sql.DataFrame = [name: string, age: int ... 1 more 
> field]
> scala>
> scala> res1.registerTempTable("tablerice")
> warning: there was one deprecation warning; re-run with -deprecation for 
> details
> scala> spark.sql("select * from tablerice")
> res3: org.apache.spark.sql.DataFrame = [name: string, age: int ... 1 more 
> field]
> scala> res3.show(3)
> +-----+---+-------+
> | name|age|married|
> +-----+---+-------+
> |Sagar| 26|  false|
> |Sagar| 30|  false|
> |Sagar| 34|  false|
> +-----+---+-------+
> only showing top 3 rows
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-9407) Support orc rolling sink writer

2021-04-22 Thread Flink Jira Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-9407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flink Jira Bot updated FLINK-9407:
--
Labels: pull-request-available stale-major usability  (was: 
pull-request-available usability)

> Support orc rolling sink writer
> ---
>
> Key: FLINK-9407
> URL: https://issues.apache.org/jira/browse/FLINK-9407
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / FileSystem
>Reporter: zhangminglei
>Priority: Major
>  Labels: pull-request-available, stale-major, usability
> Fix For: 1.13.0
>
>
> Currently, we only support {{StringWriter}}, {{SequenceFileWriter}} and 
> {{AvroKeyValueSinkWriter}}. I would suggest adding an ORC writer for the 
> rolling sink.
> Below, FYI.
> I tested the PR and verified the results with Spark SQL; the data written 
> earlier can be read back as expected. I will add more tests in the next 
> couple of days, including performance under compression with short 
> checkpoint intervals, and more unit tests.
> {code:java}
> scala> spark.read.orc("hdfs://10.199.196.0:9000/data/hive/man/2018-07-06--21")
> res1: org.apache.spark.sql.DataFrame = [name: string, age: int ... 1 more 
> field]
> scala>
> scala> res1.registerTempTable("tablerice")
> warning: there was one deprecation warning; re-run with -deprecation for 
> details
> scala> spark.sql("select * from tablerice")
> res3: org.apache.spark.sql.DataFrame = [name: string, age: int ... 1 more 
> field]
> scala> res3.show(3)
> +-----+---+-------+
> | name|age|married|
> +-----+---+-------+
> |Sagar| 26|  false|
> |Sagar| 30|  false|
> |Sagar| 34|  false|
> +-----+---+-------+
> only showing top 3 rows
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-9407) Support orc rolling sink writer

2020-12-07 Thread Robert Metzger (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-9407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-9407:
--
Fix Version/s: (was: 1.12.0)
   1.13.0

> Support orc rolling sink writer
> ---
>
> Key: FLINK-9407
> URL: https://issues.apache.org/jira/browse/FLINK-9407
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / FileSystem
>Reporter: zhangminglei
>Priority: Major
>  Labels: pull-request-available, usability
> Fix For: 1.13.0
>
>
> Currently, we only support {{StringWriter}}, {{SequenceFileWriter}} and 
> {{AvroKeyValueSinkWriter}}. I would suggest adding an ORC writer for the 
> rolling sink.
> Below, FYI.
> I tested the PR and verified the results with Spark SQL; the data written 
> earlier can be read back as expected. I will add more tests in the next 
> couple of days, including performance under compression with short 
> checkpoint intervals, and more unit tests.
> {code:java}
> scala> spark.read.orc("hdfs://10.199.196.0:9000/data/hive/man/2018-07-06--21")
> res1: org.apache.spark.sql.DataFrame = [name: string, age: int ... 1 more 
> field]
> scala>
> scala> res1.registerTempTable("tablerice")
> warning: there was one deprecation warning; re-run with -deprecation for 
> details
> scala> spark.sql("select * from tablerice")
> res3: org.apache.spark.sql.DataFrame = [name: string, age: int ... 1 more 
> field]
> scala> res3.show(3)
> +-----+---+-------+
> | name|age|married|
> +-----+---+-------+
> |Sagar| 26|  false|
> |Sagar| 30|  false|
> |Sagar| 34|  false|
> +-----+---+-------+
> only showing top 3 rows
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-9407) Support orc rolling sink writer

2020-06-02 Thread Danny Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-9407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Danny Chen updated FLINK-9407:
--
Fix Version/s: (was: 1.11.0)
   1.12.0

> Support orc rolling sink writer
> ---
>
> Key: FLINK-9407
> URL: https://issues.apache.org/jira/browse/FLINK-9407
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / FileSystem
>Reporter: zhangminglei
>Priority: Major
>  Labels: pull-request-available, usability
> Fix For: 1.12.0
>
>
> Currently, we only support {{StringWriter}}, {{SequenceFileWriter}} and 
> {{AvroKeyValueSinkWriter}}. I would suggest adding an ORC writer for the 
> rolling sink.
> Below, FYI.
> I tested the PR and verified the results with Spark SQL; the data written 
> earlier can be read back as expected. I will add more tests in the next 
> couple of days, including performance under compression with short 
> checkpoint intervals, and more unit tests.
> {code:java}
> scala> spark.read.orc("hdfs://10.199.196.0:9000/data/hive/man/2018-07-06--21")
> res1: org.apache.spark.sql.DataFrame = [name: string, age: int ... 1 more 
> field]
> scala>
> scala> res1.registerTempTable("tablerice")
> warning: there was one deprecation warning; re-run with -deprecation for 
> details
> scala> spark.sql("select * from tablerice")
> res3: org.apache.spark.sql.DataFrame = [name: string, age: int ... 1 more 
> field]
> scala> res3.show(3)
> +-----+---+-------+
> | name|age|married|
> +-----+---+-------+
> |Sagar| 26|  false|
> |Sagar| 30|  false|
> |Sagar| 34|  false|
> +-----+---+-------+
> only showing top 3 rows
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-9407) Support orc rolling sink writer

2020-01-15 Thread Stephan Ewen (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-9407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephan Ewen updated FLINK-9407:

Labels: pull-request-available usability  (was: pull-request-available)

> Support orc rolling sink writer
> ---
>
> Key: FLINK-9407
> URL: https://issues.apache.org/jira/browse/FLINK-9407
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / FileSystem
>Reporter: zhangminglei
>Assignee: zhangminglei
>Priority: Major
>  Labels: pull-request-available, usability
> Fix For: 1.11.0
>
>
> Currently, we only support {{StringWriter}}, {{SequenceFileWriter}} and 
> {{AvroKeyValueSinkWriter}}. I would suggest adding an ORC writer for the 
> rolling sink.
> Below, FYI.
> I tested the PR and verified the results with Spark SQL; the data written 
> earlier can be read back as expected. I will add more tests in the next 
> couple of days, including performance under compression with short 
> checkpoint intervals, and more unit tests.
> {code:java}
> scala> spark.read.orc("hdfs://10.199.196.0:9000/data/hive/man/2018-07-06--21")
> res1: org.apache.spark.sql.DataFrame = [name: string, age: int ... 1 more 
> field]
> scala>
> scala> res1.registerTempTable("tablerice")
> warning: there was one deprecation warning; re-run with -deprecation for 
> details
> scala> spark.sql("select * from tablerice")
> res3: org.apache.spark.sql.DataFrame = [name: string, age: int ... 1 more 
> field]
> scala> res3.show(3)
> +-----+---+-------+
> | name|age|married|
> +-----+---+-------+
> |Sagar| 26|  false|
> |Sagar| 30|  false|
> |Sagar| 34|  false|
> +-----+---+-------+
> only showing top 3 rows
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-9407) Support orc rolling sink writer

2020-01-15 Thread Stephan Ewen (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-9407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephan Ewen updated FLINK-9407:

Fix Version/s: 1.11.0

> Support orc rolling sink writer
> ---
>
> Key: FLINK-9407
> URL: https://issues.apache.org/jira/browse/FLINK-9407
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / FileSystem
>Reporter: zhangminglei
>Assignee: zhangminglei
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.11.0
>
>
> Currently, we only support {{StringWriter}}, {{SequenceFileWriter}} and 
> {{AvroKeyValueSinkWriter}}. I would suggest adding an ORC writer for the 
> rolling sink.
> Below, FYI.
> I tested the PR and verified the results with Spark SQL; the data written 
> earlier can be read back as expected. I will add more tests in the next 
> couple of days, including performance under compression with short 
> checkpoint intervals, and more unit tests.
> {code:java}
> scala> spark.read.orc("hdfs://10.199.196.0:9000/data/hive/man/2018-07-06--21")
> res1: org.apache.spark.sql.DataFrame = [name: string, age: int ... 1 more 
> field]
> scala>
> scala> res1.registerTempTable("tablerice")
> warning: there was one deprecation warning; re-run with -deprecation for 
> details
> scala> spark.sql("select * from tablerice")
> res3: org.apache.spark.sql.DataFrame = [name: string, age: int ... 1 more 
> field]
> scala> res3.show(3)
> +-----+---+-------+
> | name|age|married|
> +-----+---+-------+
> |Sagar| 26|  false|
> |Sagar| 30|  false|
> |Sagar| 34|  false|
> +-----+---+-------+
> only showing top 3 rows
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-9407) Support orc rolling sink writer

2018-07-08 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-9407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-9407:
--
Labels: pull-request-available  (was: )

> Support orc rolling sink writer
> ---
>
> Key: FLINK-9407
> URL: https://issues.apache.org/jira/browse/FLINK-9407
> Project: Flink
>  Issue Type: New Feature
>  Components: filesystem-connector
>Reporter: zhangminglei
>Assignee: zhangminglei
>Priority: Major
>  Labels: pull-request-available
>
> Currently, we only support {{StringWriter}}, {{SequenceFileWriter}} and 
> {{AvroKeyValueSinkWriter}}. I would suggest adding an ORC writer for the 
> rolling sink.
> Below, FYI.
> I tested the PR and verified the results with Spark SQL; the data written 
> earlier can be read back as expected. I will add more tests in the next 
> couple of days, including performance under compression with short 
> checkpoint intervals, and more unit tests.
> {code:java}
> scala> spark.read.orc("hdfs://10.199.196.0:9000/data/hive/man/2018-07-06--21")
> res1: org.apache.spark.sql.DataFrame = [name: string, age: int ... 1 more 
> field]
> scala>
> scala> res1.registerTempTable("tablerice")
> warning: there was one deprecation warning; re-run with -deprecation for 
> details
> scala> spark.sql("select * from tablerice")
> res3: org.apache.spark.sql.DataFrame = [name: string, age: int ... 1 more 
> field]
> scala> res3.show(3)
> +-----+---+-------+
> | name|age|married|
> +-----+---+-------+
> |Sagar| 26|  false|
> |Sagar| 30|  false|
> |Sagar| 34|  false|
> +-----+---+-------+
> only showing top 3 rows
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (FLINK-9407) Support orc rolling sink writer

2018-07-08 Thread zhangminglei (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-9407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhangminglei updated FLINK-9407:

Description: 
Currently, we only support {{StringWriter}}, {{SequenceFileWriter}} and 
{{AvroKeyValueSinkWriter}}. I would suggest adding an ORC writer for the 
rolling sink.

Below, FYI.

I tested the PR and verified the results with Spark SQL; the data written 
earlier can be read back as expected. I will add more tests in the next couple 
of days, including performance under compression with short checkpoint 
intervals, and more unit tests.
{code:java}
scala> spark.read.orc("hdfs://10.199.196.0:9000/data/hive/man/2018-07-06--21")
res1: org.apache.spark.sql.DataFrame = [name: string, age: int ... 1 more field]

scala>

scala> res1.registerTempTable("tablerice")
warning: there was one deprecation warning; re-run with -deprecation for details

scala> spark.sql("select * from tablerice")
res3: org.apache.spark.sql.DataFrame = [name: string, age: int ... 1 more field]

scala> res3.show(3)
+-----+---+-------+
| name|age|married|
+-----+---+-------+
|Sagar| 26|  false|
|Sagar| 30|  false|
|Sagar| 34|  false|
+-----+---+-------+
only showing top 3 rows
{code}

  was:
Currently, we only support {{StringWriter}}, {{SequenceFileWriter}} and 
{{AvroKeyValueSinkWriter}}. I would suggest add an orc writer for rolling sink.

Below, FYI.

I tested the PR and verify the results with spark sql. Obviously, we can get 
the results of what we had written down before. But I will give more tests in 
the next couple of days. Including the performance under compression. And more 
UTs.
{code:java}
scala> spark.read.orc("hdfs://10.199.196.0:9000/data/hive/man/2018-07-06--21")
res1: org.apache.spark.sql.DataFrame = [name: string, age: int ... 1 more field]

scala>

scala> res1.registerTempTable("tablerice")
warning: there was one deprecation warning; re-run with -deprecation for details

scala> spark.sql("select * from tablerice")
res3: org.apache.spark.sql.DataFrame = [name: string, age: int ... 1 more field]

scala> res3.show(3)
+-----+---+-------+
| name|age|married|
+-----+---+-------+
|Sagar| 26|  false|
|Sagar| 30|  false|
|Sagar| 34|  false|
+-----+---+-------+
only showing top 3 rows
{code}


> Support orc rolling sink writer
> ---
>
> Key: FLINK-9407
> URL: https://issues.apache.org/jira/browse/FLINK-9407
> Project: Flink
>  Issue Type: New Feature
>  Components: filesystem-connector
>Reporter: zhangminglei
>Assignee: zhangminglei
>Priority: Major
>
> Currently, we only support {{StringWriter}}, {{SequenceFileWriter}} and 
> {{AvroKeyValueSinkWriter}}. I would suggest adding an ORC writer for the 
> rolling sink.
> Below, FYI.
> I tested the PR and verified the results with Spark SQL; the data written 
> earlier can be read back as expected. I will add more tests in the next 
> couple of days, including performance under compression with short 
> checkpoint intervals, and more unit tests.
> {code:java}
> scala> spark.read.orc("hdfs://10.199.196.0:9000/data/hive/man/2018-07-06--21")
> res1: org.apache.spark.sql.DataFrame = [name: string, age: int ... 1 more 
> field]
> scala>
> scala> res1.registerTempTable("tablerice")
> warning: there was one deprecation warning; re-run with -deprecation for 
> details
> scala> spark.sql("select * from tablerice")
> res3: org.apache.spark.sql.DataFrame = [name: string, age: int ... 1 more 
> field]
> scala> res3.show(3)
> +-----+---+-------+
> | name|age|married|
> +-----+---+-------+
> |Sagar| 26|  false|
> |Sagar| 30|  false|
> |Sagar| 34|  false|
> +-----+---+-------+
> only showing top 3 rows
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (FLINK-9407) Support orc rolling sink writer

2018-07-08 Thread zhangminglei (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-9407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhangminglei updated FLINK-9407:

Labels:   (was: patch-available pull-request-available)

> Support orc rolling sink writer
> ---
>
> Key: FLINK-9407
> URL: https://issues.apache.org/jira/browse/FLINK-9407
> Project: Flink
>  Issue Type: New Feature
>  Components: filesystem-connector
>Reporter: zhangminglei
>Assignee: zhangminglei
>Priority: Major
>
> Currently, we only support {{StringWriter}}, {{SequenceFileWriter}} and 
> {{AvroKeyValueSinkWriter}}. I would suggest adding an ORC writer for the 
> rolling sink.
> Below, FYI.
> I tested the PR and verified the results with Spark SQL; the data written 
> earlier can be read back as expected. I will add more tests in the next 
> couple of days, including performance under compression, and more unit tests.
> {code:java}
> scala> spark.read.orc("hdfs://10.199.196.0:9000/data/hive/man/2018-07-06--21")
> res1: org.apache.spark.sql.DataFrame = [name: string, age: int ... 1 more 
> field]
> scala>
> scala> res1.registerTempTable("tablerice")
> warning: there was one deprecation warning; re-run with -deprecation for 
> details
> scala> spark.sql("select * from tablerice")
> res3: org.apache.spark.sql.DataFrame = [name: string, age: int ... 1 more 
> field]
> scala> res3.show(3)
> +-----+---+-------+
> | name|age|married|
> +-----+---+-------+
> |Sagar| 26|  false|
> |Sagar| 30|  false|
> |Sagar| 34|  false|
> +-----+---+-------+
> only showing top 3 rows
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (FLINK-9407) Support orc rolling sink writer

2018-07-08 Thread zhangminglei (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-9407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhangminglei updated FLINK-9407:

Description: 
Currently, we only support {{StringWriter}}, {{SequenceFileWriter}} and 
{{AvroKeyValueSinkWriter}}. I would suggest adding an ORC writer for the 
rolling sink.

Below, FYI.

I tested the PR and verified the results with Spark SQL; the data written 
earlier can be read back as expected. I will add more tests in the next couple 
of days, including performance under compression, and more unit tests.
{code:java}
scala> spark.read.orc("hdfs://10.199.196.0:9000/data/hive/man/2018-07-06--21")
res1: org.apache.spark.sql.DataFrame = [name: string, age: int ... 1 more field]

scala>

scala> res1.registerTempTable("tablerice")
warning: there was one deprecation warning; re-run with -deprecation for details

scala> spark.sql("select * from tablerice")
res3: org.apache.spark.sql.DataFrame = [name: string, age: int ... 1 more field]

scala> res3.show(3)
+-----+---+-------+
| name|age|married|
+-----+---+-------+
|Sagar| 26|  false|
|Sagar| 30|  false|
|Sagar| 34|  false|
+-----+---+-------+
only showing top 3 rows
{code}

  was:
Currently, we only support {{StringWriter}}, {{SequenceFileWriter}} and 
{{AvroKeyValueSinkWriter}}. I would suggest add an orc writer for rolling sink.

Below, FYI.

I tested the PR and verify the results with spark sql. Obviously, we can get 
the results of what we written down before. But I will give more tests in the 
next couple of days. Including the performance under compression. And more UTs.

{code:java}
scala> spark.read.orc("hdfs://10.199.196.0:9000/data/hive/man/2018-07-06--21")
res1: org.apache.spark.sql.DataFrame = [name: string, age: int ... 1 more field]

scala>

scala> res1.registerTempTable("tablerice")
warning: there was one deprecation warning; re-run with -deprecation for details

scala> spark.sql("select * from tablerice")
res3: org.apache.spark.sql.DataFrame = [name: string, age: int ... 1 more field]

scala> res3.show(3)
+-----+---+-------+
| name|age|married|
+-----+---+-------+
|Sagar| 26|  false|
|Sagar| 30|  false|
|Sagar| 34|  false|
+-----+---+-------+
only showing top 3 rows
{code}



> Support orc rolling sink writer
> ---
>
> Key: FLINK-9407
> URL: https://issues.apache.org/jira/browse/FLINK-9407
> Project: Flink
>  Issue Type: New Feature
>  Components: filesystem-connector
>Reporter: zhangminglei
>Assignee: zhangminglei
>Priority: Major
>  Labels: patch-available, pull-request-available
>
> Currently, we only support {{StringWriter}}, {{SequenceFileWriter}} and 
> {{AvroKeyValueSinkWriter}}. I would suggest adding an ORC writer for the 
> rolling sink.
> Below, FYI.
> I tested the PR and verified the results with Spark SQL; the data written 
> earlier can be read back as expected. I will add more tests in the next 
> couple of days, including performance under compression, and more unit tests.
> {code:java}
> scala> spark.read.orc("hdfs://10.199.196.0:9000/data/hive/man/2018-07-06--21")
> res1: org.apache.spark.sql.DataFrame = [name: string, age: int ... 1 more 
> field]
> scala>
> scala> res1.registerTempTable("tablerice")
> warning: there was one deprecation warning; re-run with -deprecation for 
> details
> scala> spark.sql("select * from tablerice")
> res3: org.apache.spark.sql.DataFrame = [name: string, age: int ... 1 more 
> field]
> scala> res3.show(3)
> +-----+---+-------+
> | name|age|married|
> +-----+---+-------+
> |Sagar| 26|  false|
> |Sagar| 30|  false|
> |Sagar| 34|  false|
> +-----+---+-------+
> only showing top 3 rows
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (FLINK-9407) Support orc rolling sink writer

2018-07-08 Thread zhangminglei (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-9407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhangminglei updated FLINK-9407:

Description: 
Currently, we only support {{StringWriter}}, {{SequenceFileWriter}} and 
{{AvroKeyValueSinkWriter}}. I would suggest adding an ORC writer for the 
rolling sink.

Below, FYI.

I tested the PR and verified the results with Spark SQL; the data written 
earlier can be read back as expected. I will add more tests in the next couple 
of days, including performance under compression, and more unit tests.

{code:java}
scala> spark.read.orc("hdfs://10.199.196.0:9000/data/hive/man/2018-07-06--21")
res1: org.apache.spark.sql.DataFrame = [name: string, age: int ... 1 more field]

scala>

scala> res1.registerTempTable("tablerice")
warning: there was one deprecation warning; re-run with -deprecation for details

scala> spark.sql("select * from tablerice")
res3: org.apache.spark.sql.DataFrame = [name: string, age: int ... 1 more field]

scala> res3.show(3)
+-----+---+-------+
| name|age|married|
+-----+---+-------+
|Sagar| 26|  false|
|Sagar| 30|  false|
|Sagar| 34|  false|
+-----+---+-------+
only showing top 3 rows
{code}


  was:
Currently, we only support {{StringWriter}}, {{SequenceFileWriter}} and 
{{AvroKeyValueSinkWriter}}. I would suggest add an orc writer for rolling sink.

Below, FYI.

I tested the PR and verify the results with spark sql. Obviously, we can get 
the results of what we written down before. But I will give more tests in the 
next couple of days. Including the performance under compression. And more UT 
tests.

{code:java}
scala> spark.read.orc("hdfs://10.199.196.0:9000/data/hive/man/2018-07-06--21")
res1: org.apache.spark.sql.DataFrame = [name: string, age: int ... 1 more field]

scala>

scala> res1.registerTempTable("tablerice")
warning: there was one deprecation warning; re-run with -deprecation for details

scala> spark.sql("select * from tablerice")
res3: org.apache.spark.sql.DataFrame = [name: string, age: int ... 1 more field]

scala> res3.show(3)
+-----+---+-------+
| name|age|married|
+-----+---+-------+
|Sagar| 26|  false|
|Sagar| 30|  false|
|Sagar| 34|  false|
+-----+---+-------+
only showing top 3 rows
{code}



> Support orc rolling sink writer
> ---
>
> Key: FLINK-9407
> URL: https://issues.apache.org/jira/browse/FLINK-9407
> Project: Flink
>  Issue Type: New Feature
>  Components: filesystem-connector
>Reporter: zhangminglei
>Assignee: zhangminglei
>Priority: Major
>  Labels: patch-available, pull-request-available
>
> Currently, we only support {{StringWriter}}, {{SequenceFileWriter}} and 
> {{AvroKeyValueSinkWriter}}. I would suggest adding an ORC writer for the 
> rolling sink.
> Below, FYI.
> I tested the PR and verified the results with Spark SQL; the data written 
> earlier can be read back as expected. I will add more tests in the next 
> couple of days, including performance under compression, and more unit tests.
> {code:java}
> scala> spark.read.orc("hdfs://10.199.196.0:9000/data/hive/man/2018-07-06--21")
> res1: org.apache.spark.sql.DataFrame = [name: string, age: int ... 1 more 
> field]
> scala>
> scala> res1.registerTempTable("tablerice")
> warning: there was one deprecation warning; re-run with -deprecation for 
> details
> scala> spark.sql("select * from tablerice")
> res3: org.apache.spark.sql.DataFrame = [name: string, age: int ... 1 more 
> field]
> scala> res3.show(3)
> +-----+---+-------+
> | name|age|married|
> +-----+---+-------+
> |Sagar| 26|  false|
> |Sagar| 30|  false|
> |Sagar| 34|  false|
> +-----+---+-------+
> only showing top 3 rows
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (FLINK-9407) Support orc rolling sink writer

2018-07-08 Thread zhangminglei (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-9407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhangminglei updated FLINK-9407:

Description: 
Currently, we only support {{StringWriter}}, {{SequenceFileWriter}} and 
{{AvroKeyValueSinkWriter}}. I would suggest adding an ORC writer for the 
rolling sink.

Below, FYI.

I tested the PR and verified the results with Spark SQL; the data written 
earlier can be read back as expected. I will add more tests in the next couple 
of days, including performance under compression, and more unit tests.

{code:java}
scala> spark.read.orc("hdfs://10.199.196.0:9000/data/hive/man/2018-07-06--21")
res1: org.apache.spark.sql.DataFrame = [name: string, age: int ... 1 more field]

scala>

scala> res1.registerTempTable("tablerice")
warning: there was one deprecation warning; re-run with -deprecation for details

scala> spark.sql("select * from tablerice")
res3: org.apache.spark.sql.DataFrame = [name: string, age: int ... 1 more field]

scala> res3.show(3)
+-----+---+-------+
| name|age|married|
+-----+---+-------+
|Sagar| 26|  false|
|Sagar| 30|  false|
|Sagar| 34|  false|
+-----+---+-------+
only showing top 3 rows
{code}


  was:Currently, we only support {{StringWriter}}, {{SequenceFileWriter}} and 
{{AvroKeyValueSinkWriter}}. I would suggest add an orc writer for rolling sink.


> Support orc rolling sink writer
> ---
>
> Key: FLINK-9407
> URL: https://issues.apache.org/jira/browse/FLINK-9407
> Project: Flink
>  Issue Type: New Feature
>  Components: filesystem-connector
>Reporter: zhangminglei
>Assignee: zhangminglei
>Priority: Major
>  Labels: patch-available, pull-request-available
>
> Currently, we only support {{StringWriter}}, {{SequenceFileWriter}} and 
> {{AvroKeyValueSinkWriter}}. I would suggest adding an ORC writer for the 
> rolling sink.
> Below, FYI.
> I tested the PR and verified the results with Spark SQL; the data written 
> earlier can be read back as expected. I will add more tests in the next 
> couple of days, including performance under compression, and more unit tests.
> {code:java}
> scala> spark.read.orc("hdfs://10.199.196.0:9000/data/hive/man/2018-07-06--21")
> res1: org.apache.spark.sql.DataFrame = [name: string, age: int ... 1 more 
> field]
> scala>
> scala> res1.registerTempTable("tablerice")
> warning: there was one deprecation warning; re-run with -deprecation for 
> details
> scala> spark.sql("select * from tablerice")
> res3: org.apache.spark.sql.DataFrame = [name: string, age: int ... 1 more 
> field]
> scala> res3.show(3)
> +-----+---+-------+
> | name|age|married|
> +-----+---+-------+
> |Sagar| 26|  false|
> |Sagar| 30|  false|
> |Sagar| 34|  false|
> +-----+---+-------+
> only showing top 3 rows
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (FLINK-9407) Support orc rolling sink writer

2018-07-04 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-9407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-9407:
--
Labels: patch-available pull-request-available  (was: patch-available)

> Support orc rolling sink writer
> ---
>
> Key: FLINK-9407
> URL: https://issues.apache.org/jira/browse/FLINK-9407
> Project: Flink
>  Issue Type: New Feature
>  Components: filesystem-connector
>Reporter: zhangminglei
>Assignee: zhangminglei
>Priority: Major
>  Labels: patch-available, pull-request-available
>
> Currently, we only support {{StringWriter}}, {{SequenceFileWriter}} and 
> {{AvroKeyValueSinkWriter}}. I would suggest adding an ORC writer for the 
> rolling sink.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (FLINK-9407) Support orc rolling sink writer

2018-06-26 Thread zhangminglei (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-9407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhangminglei updated FLINK-9407:

Labels: patch-available  (was: )

> Support orc rolling sink writer
> ---
>
> Key: FLINK-9407
> URL: https://issues.apache.org/jira/browse/FLINK-9407
> Project: Flink
>  Issue Type: New Feature
>  Components: filesystem-connector
>Reporter: zhangminglei
>Assignee: zhangminglei
>Priority: Major
>  Labels: patch-available
>
> Currently, we only support {{StringWriter}}, {{SequenceFileWriter}} and 
> {{AvroKeyValueSinkWriter}}. I would suggest adding an ORC writer for the 
> rolling sink.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (FLINK-9407) Support orc rolling sink writer

2018-05-23 Thread mingleizhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-9407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mingleizhang updated FLINK-9407:

Description: Currently, we only support {{StringWriter}}, 
{{SequenceFileWriter}} and {{AvroKeyValueSinkWriter}}. I would suggest adding 
an ORC writer for the rolling sink.  (was: Currently, we only support 
{{StringWriter}}, {{SequenceFileWriter}} and {{AvroKeyValueSinkWriter}}. I 
would suggest add orc writer for rolling sink.)

> Support orc rolling sink writer
> ---
>
> Key: FLINK-9407
> URL: https://issues.apache.org/jira/browse/FLINK-9407
> Project: Flink
>  Issue Type: New Feature
>  Components: filesystem-connector
>Reporter: mingleizhang
>Assignee: mingleizhang
>Priority: Major
>
> Currently, we only support {{StringWriter}}, {{SequenceFileWriter}} and 
> {{AvroKeyValueSinkWriter}}. I would suggest adding an ORC writer for the 
> rolling sink.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)