[GitHub] spark issue #16030: [SPARK-18108][SQL] Fix a bug to fail partition schema in...

2016-11-28 Thread brkyvz
Github user brkyvz commented on the issue: https://github.com/apache/spark/pull/16030 Or what we should fix here is to throw an exception if a partition column is also found as part of the dataSchema. --- If your project is set up for it, you can reply to this email

[GitHub] spark issue #16030: [SPARK-18108][SQL] Fix a bug to fail partition schema in...

2016-11-28 Thread brkyvz
Github user brkyvz commented on the issue: https://github.com/apache/spark/pull/16030 @maropu I wouldn't say this is a regression. I would say that this working for 2.0.2 was a bug in 2.0.2. If you want the column `a` to be interpreted as a `LongType` instead of `IntegerType`, you

[GitHub] spark pull request #16030: [SPARK-18108][SQL] Fix a bug to fail partition sc...

2016-11-28 Thread brkyvz
Github user brkyvz commented on a diff in the pull request: https://github.com/apache/spark/pull/16030#discussion_r89845728 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetPartitionDiscoverySuite.scala --- @@ -969,4 +969,15 @@ class

[GitHub] spark issue #15896: [SPARK-18465] Add 'IF EXISTS' clause to 'UNCACHE' to not...

2016-11-22 Thread brkyvz
Github user brkyvz commented on the issue: https://github.com/apache/spark/pull/15896 @hvanhovell Done! Thanks for the quick review!

[GitHub] spark issue #15896: [SPARK-18465] Add 'IF EXISTS' clause to 'UNCACHE' to not...

2016-11-22 Thread brkyvz
Github user brkyvz commented on the issue: https://github.com/apache/spark/pull/15896 Hey @andrewor14 ! I went with your way. @hvanhovell can you take a quick look please? I would really like this to be available in Spark 2.1 (even though it is a new API)

[GitHub] spark pull request #15951: [SPARK-18510] Fix data corruption from inferred p...

2016-11-21 Thread brkyvz
Github user brkyvz commented on a diff in the pull request: https://github.com/apache/spark/pull/15951#discussion_r89030586 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala --- @@ -84,30 +84,95 @@ case class DataSource( private

[GitHub] spark pull request #15951: [SPARK-18510] Fix data corruption from inferred p...

2016-11-21 Thread brkyvz
Github user brkyvz commented on a diff in the pull request: https://github.com/apache/spark/pull/15951#discussion_r89030610 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala --- @@ -84,30 +84,95 @@ case class DataSource( private

[GitHub] spark pull request #15951: [SPARK-18510] Fix data corruption from inferred p...

2016-11-21 Thread brkyvz
Github user brkyvz commented on a diff in the pull request: https://github.com/apache/spark/pull/15951#discussion_r89030487 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala --- @@ -84,30 +84,95 @@ case class DataSource( private

[GitHub] spark pull request #15951: [SPARK-18510] Fix data corruption from inferred p...

2016-11-21 Thread brkyvz
Github user brkyvz commented on a diff in the pull request: https://github.com/apache/spark/pull/15951#discussion_r89030464 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala --- @@ -84,30 +84,95 @@ case class DataSource( private

[GitHub] spark issue #15951: [SPARK-18510] Fix data corruption from inferred partitio...

2016-11-21 Thread brkyvz
Github user brkyvz commented on the issue: https://github.com/apache/spark/pull/15951 Thanks @tejasapatil for the review. Addressed your comments

[GitHub] spark issue #15951: [SPARK-18510] Fix data corruption from inferred partitio...

2016-11-21 Thread brkyvz
Github user brkyvz commented on the issue: https://github.com/apache/spark/pull/15951 @ericl I feel that would probably break 90% of production Spark jobs out there, therefore am a bit scared of something radical. I agree, it's confusing and annoying

[GitHub] spark issue #15949: [SPARK-18339] [SQL] Don't push down current_timestamp fo...

2016-11-21 Thread brkyvz
Github user brkyvz commented on the issue: https://github.com/apache/spark/pull/15949 Left several comments that are all related to each other. @zsxwing I would like your feedback on those as well in order to not make @tcondie make too many changes flipping timestamp precision

[GitHub] spark pull request #15949: [SPARK-18339] [SQL] Don't push down current_times...

2016-11-21 Thread brkyvz
Github user brkyvz commented on a diff in the pull request: https://github.com/apache/spark/pull/15949#discussion_r89025767 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamExecution.scala --- @@ -344,8 +370,11 @@ class StreamExecution

[GitHub] spark pull request #15949: [SPARK-18339] [SQL] Don't push down current_times...

2016-11-21 Thread brkyvz
Github user brkyvz commented on a diff in the pull request: https://github.com/apache/spark/pull/15949#discussion_r89025593 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala --- @@ -72,6 +72,26 @@ case class CurrentTimestamp

[GitHub] spark pull request #15949: [SPARK-18339] [SQL] Don't push down current_times...

2016-11-21 Thread brkyvz
Github user brkyvz commented on a diff in the pull request: https://github.com/apache/spark/pull/15949#discussion_r89025088 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala --- @@ -72,6 +72,26 @@ case class CurrentTimestamp

[GitHub] spark issue #15921: [SPARK-18493] Add missing python APIs: withWatermark and...

2016-11-21 Thread brkyvz
Github user brkyvz commented on the issue: https://github.com/apache/spark/pull/15921 Hallelujah! @zsxwing shall we merge this?

[GitHub] spark pull request #15949: [SPARK-18339] [SQL] Don't push down current_times...

2016-11-21 Thread brkyvz
Github user brkyvz commented on a diff in the pull request: https://github.com/apache/spark/pull/15949#discussion_r89017675 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamExecution.scala --- @@ -38,6 +40,26 @@ import

[GitHub] spark pull request #15949: [SPARK-18339] [SQL] Don't push down current_times...

2016-11-21 Thread brkyvz
Github user brkyvz commented on a diff in the pull request: https://github.com/apache/spark/pull/15949#discussion_r89017544 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamExecution.scala --- @@ -422,6 +451,8 @@ class StreamExecution( val

[GitHub] spark pull request #15949: [SPARK-18339] [SQL] Don't push down current_times...

2016-11-21 Thread brkyvz
Github user brkyvz commented on a diff in the pull request: https://github.com/apache/spark/pull/15949#discussion_r89017366 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala --- @@ -72,6 +72,28 @@ case class CurrentTimestamp

[GitHub] spark issue #15921: [SPARK-18493] Add missing python APIs: withWatermark and...

2016-11-21 Thread brkyvz
Github user brkyvz commented on the issue: https://github.com/apache/spark/pull/15921 thanks @gatorsmile and @tdas. I addressed your comments. The semantics look a lot cleaner now. That still doesn't mean it's clean though :P

[GitHub] spark issue #15942: [SPARK-18407] Inferred partition columns cause assertion...

2016-11-21 Thread brkyvz
Github user brkyvz commented on the issue: https://github.com/apache/spark/pull/15942 Closing this in favor of #15951

[GitHub] spark pull request #15942: [SPARK-18407] Inferred partition columns cause as...

2016-11-21 Thread brkyvz
Github user brkyvz closed the pull request at: https://github.com/apache/spark/pull/15942

[GitHub] spark pull request #15951: [SPARK-18510] Fix data corruption from inferred p...

2016-11-21 Thread brkyvz
Github user brkyvz commented on a diff in the pull request: https://github.com/apache/spark/pull/15951#discussion_r89012272 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala --- @@ -291,22 +360,24 @@ case class DataSource

[GitHub] spark pull request #15951: [SPARK-18510] Fix data corruption from inferred p...

2016-11-21 Thread brkyvz
Github user brkyvz commented on a diff in the pull request: https://github.com/apache/spark/pull/15951#discussion_r89007519 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala --- @@ -84,30 +84,96 @@ case class DataSource( private

[GitHub] spark pull request #15951: [SPARK-18510] Fix data corruption from inferred p...

2016-11-21 Thread brkyvz
Github user brkyvz commented on a diff in the pull request: https://github.com/apache/spark/pull/15951#discussion_r88994472 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala --- @@ -84,30 +84,90 @@ case class DataSource( private

[GitHub] spark pull request #15949: [SPARK-18339] [SQL] Don't push down current_times...

2016-11-21 Thread brkyvz
Github user brkyvz commented on a diff in the pull request: https://github.com/apache/spark/pull/15949#discussion_r88941686 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamExecution.scala --- @@ -422,6 +432,7 @@ class StreamExecution( val

[GitHub] spark pull request #15951: [SPARK-18510] Fix data corruption from inferred p...

2016-11-21 Thread brkyvz
Github user brkyvz commented on a diff in the pull request: https://github.com/apache/spark/pull/15951#discussion_r88934282 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala --- @@ -272,14 +309,20 @@ case class DataSource

[GitHub] spark pull request #15951: [SPARK-18510] Fix data corruption from inferred p...

2016-11-21 Thread brkyvz
Github user brkyvz commented on a diff in the pull request: https://github.com/apache/spark/pull/15951#discussion_r88934128 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/test/DataFrameReaderWriterSuite.scala --- @@ -573,4 +573,39 @@ class DataFrameReaderWriterSuite

[GitHub] spark issue #15951: [SPARK-18510] Fix data corruption from inferred partitio...

2016-11-21 Thread brkyvz
Github user brkyvz commented on the issue: https://github.com/apache/spark/pull/15951 True. But there's no reason "part" and "id" can't be strings, right? On Nov 21, 2016 12:16 AM, "Xiao Li" <notificati...@github.com> wrote:

[GitHub] spark issue #15951: [SPARK-18510] Fix data corruption from inferred partitio...

2016-11-20 Thread brkyvz
Github user brkyvz commented on the issue: https://github.com/apache/spark/pull/15951 cc @rxin @marmbrus Don't know who's the best person to look at this, but git blame says I mainly changed your code :)

[GitHub] spark pull request #15951: [SPARK-18510] Fix data corruption from inferred p...

2016-11-20 Thread brkyvz
GitHub user brkyvz opened a pull request: https://github.com/apache/spark/pull/15951 [SPARK-18510] Fix data corruption from inferred partition column dataTypes ## What changes were proposed in this pull request? ### The Issue If I specify my schema when doing

[GitHub] spark pull request #15942: [SPARK-18407] Inferred partition columns cause as...

2016-11-19 Thread brkyvz
GitHub user brkyvz opened a pull request: https://github.com/apache/spark/pull/15942 [SPARK-18407] Inferred partition columns cause assertion error in StructuredStreaming ## What changes were proposed in this pull request? It turns out we are a bit enthusiastic when

[GitHub] spark issue #15730: [SPARK-18218][ML][MLLib] Optimize BlockMatrix multiplica...

2016-11-18 Thread brkyvz
Github user brkyvz commented on the issue: https://github.com/apache/spark/pull/15730 Hi @WeichenXu123 Thank you for this PR. Sorry for taking so long to get back to you. Your optimization would be very helpful. I have a couple thoughts though. Your examples always take into account

[GitHub] spark issue #15896: [SPARK-18465] Uncache table shouldn't throw an exception...

2016-11-17 Thread brkyvz
Github user brkyvz commented on the issue: https://github.com/apache/spark/pull/15896 On hold on my side. Will try to get back to it On Nov 17, 2016 3:31 PM, "Dongjoon Hyun" <notificati...@github.com> wrote: > Hi, @brkyvz <https://github.com/b

[GitHub] spark issue #15921: [SPARK-18493] Add missing python APIs: withWatermark and...

2016-11-17 Thread brkyvz
Github user brkyvz commented on the issue: https://github.com/apache/spark/pull/15921 cc @davies for PySpark changes cc @liancheng for `checkpoint` API and javadoc update cc @marmbrus for `withWatermark` API. Question here: should we throw an analysis exception if the Dataset

[GitHub] spark pull request #15921: Add missing python APIs: withWatermark and checkp...

2016-11-17 Thread brkyvz
GitHub user brkyvz opened a pull request: https://github.com/apache/spark/pull/15921 Add missing python APIs: withWatermark and checkpoint to dataframe ## What changes were proposed in this pull request? This PR adds two of the newly added methods of `Dataset`s to Python

[GitHub] spark issue #15896: [SPARK-18465] Uncache table shouldn't throw an exception...

2016-11-16 Thread brkyvz
Github user brkyvz commented on the issue: https://github.com/apache/spark/pull/15896 @gatorsmile Talking offline with several people, I may put this PR on hold for now since it is a behavior change. I guess it would be better to go with Options 1 or 2 that I defined in the PR

[GitHub] spark pull request #15909: [SPARK-18475] Be able to increase parallelism in ...

2016-11-16 Thread brkyvz
GitHub user brkyvz opened a pull request: https://github.com/apache/spark/pull/15909 [SPARK-18475] Be able to increase parallelism in StructuredStreaming Kafka source ## What changes were proposed in this pull request? This PR adds the configuration `numPartitions

[GitHub] spark issue #15896: [SPARK-18465] Uncache table shouldn't throw an exception...

2016-11-15 Thread brkyvz
Github user brkyvz commented on the issue: https://github.com/apache/spark/pull/15896 cc @gatorsmile I changed a test you added. Do you have any strong feelings on this?

[GitHub] spark pull request #15896: [SPARK-18465] Uncache table shouldn't throw an ex...

2016-11-15 Thread brkyvz
GitHub user brkyvz opened a pull request: https://github.com/apache/spark/pull/15896 [SPARK-18465] Uncache table shouldn't throw an exception when table doesn't exist ## What changes were proposed in this pull request? While this behavior is debatable, consider

[GitHub] spark issue #15801: [SPARK-18337] Complete mode memory sinks should be able ...

2016-11-11 Thread brkyvz
Github user brkyvz commented on the issue: https://github.com/apache/spark/pull/15801 @tdas Addressed your comments. Test time increased to 2.5 seconds though, fyi.

[GitHub] spark issue #15801: [SPARK-18337] Complete mode memory sinks should be able ...

2016-11-09 Thread brkyvz
Github user brkyvz commented on the issue: https://github.com/apache/spark/pull/15801 @tdas Added test for the data as well

[GitHub] spark issue #15815: [DOCS][SPARK-18365] Documentation is Switched on Sample ...

2016-11-08 Thread brkyvz
Github user brkyvz commented on the issue: https://github.com/apache/spark/pull/15815 LGTM

[GitHub] spark issue #15815: [DOCS][SPARK-18365] Documentation is Switched on Sample ...

2016-11-08 Thread brkyvz
Github user brkyvz commented on the issue: https://github.com/apache/spark/pull/15815 @anabranch I don't see how the documentation was wrong. The second argument doesn't take the seed as a parameter, therefore the seed is random

[GitHub] spark issue #15804: [SPARK-18342] Make rename failures fatal in HDFSBackedSt...

2016-11-08 Thread brkyvz
Github user brkyvz commented on the issue: https://github.com/apache/spark/pull/15804 @tdas Addressed

[GitHub] spark issue #15806: [SPARK-18345][STRUCTURED STREAMING] Structured Streaming...

2016-11-08 Thread brkyvz
Github user brkyvz commented on the issue: https://github.com/apache/spark/pull/15806 I think the change you should make is at: https://github.com/apache/spark/pull/15806/files#diff-7dc261474784c6402f7020ffe7f61038R212 where you always append `file:///` otherwise

[GitHub] spark issue #15806: [SPARK-18345][STRUCTURED STREAMING] Structured Streaming...

2016-11-07 Thread brkyvz
Github user brkyvz commented on the issue: https://github.com/apache/spark/pull/15806 Hi @oza, thank you for this patch, however I'm not sure if this patch fixes anything. What was your problem, what didn't work for you?

[GitHub] spark pull request #15806: [SPARK-18345][STRUCTURED STREAMING] Structured St...

2016-11-07 Thread brkyvz
Github user brkyvz commented on a diff in the pull request: https://github.com/apache/spark/pull/15806#discussion_r86933542 --- Diff: examples/src/main/scala/org/apache/spark/examples/sql/streaming/StructuredNetworkWordCount.scala --- @@ -68,6 +70,8 @@ object

[GitHub] spark pull request #15806: [SPARK-18345][STRUCTURED STREAMING] Structured St...

2016-11-07 Thread brkyvz
Github user brkyvz commented on a diff in the pull request: https://github.com/apache/spark/pull/15806#discussion_r86933461 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/streaming/StreamingQueryManager.scala --- @@ -219,10 +219,11 @@ class StreamingQueryManager private

[GitHub] spark issue #15804: [SPARK-18342] Make rename failures fatal in HDFSBackedSt...

2016-11-07 Thread brkyvz
Github user brkyvz commented on the issue: https://github.com/apache/spark/pull/15804 cc @tdas

[GitHub] spark pull request #15804: [SPARK-18342] Make rename failures fatal in HDFSB...

2016-11-07 Thread brkyvz
GitHub user brkyvz opened a pull request: https://github.com/apache/spark/pull/15804 [SPARK-18342] Make rename failures fatal in HDFSBackedStateStore ## What changes were proposed in this pull request? If the rename operation in the state store fails (`fs.rename` returns

[GitHub] spark pull request #15786: [SPARK-18261][Structured Streaming] Add statistic...

2016-11-07 Thread brkyvz
Github user brkyvz commented on a diff in the pull request: https://github.com/apache/spark/pull/15786#discussion_r86903334 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/memory.scala --- @@ -212,4 +212,8 @@ class MemorySink(val schema: StructType

[GitHub] spark pull request #15801: [SPARK-18337] Complete mode memory sinks should b...

2016-11-07 Thread brkyvz
GitHub user brkyvz opened a pull request: https://github.com/apache/spark/pull/15801 [SPARK-18337] Complete mode memory sinks should be able to recover from checkpoints ## What changes were proposed in this pull request? It would be nice if memory sinks can also recover

[GitHub] spark issue #15786: [SPARK-18261][Structured Streaming] Add statistics to Me...

2016-11-07 Thread brkyvz
Github user brkyvz commented on the issue: https://github.com/apache/spark/pull/15786 Thanks @lw-lin ! Left one last comment. @davies Can you also please take a look? I think you implemented most of the statistics code

[GitHub] spark pull request #15786: [SPARK-18261][Structured Streaming] Add statistic...

2016-11-05 Thread brkyvz
Github user brkyvz commented on a diff in the pull request: https://github.com/apache/spark/pull/15786#discussion_r86667129 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/streaming/MemorySinkSuite.scala --- @@ -187,6 +187,31 @@ class MemorySinkSuite extends StreamTest

[GitHub] spark issue #15771: [SPARK-18260] Make from_json null safe

2016-11-04 Thread brkyvz
Github user brkyvz commented on the issue: https://github.com/apache/spark/pull/15771 @marmbrus Addressed

[GitHub] spark pull request #15771: [SPARK-18260] Make from_json null safe

2016-11-04 Thread brkyvz
GitHub user brkyvz opened a pull request: https://github.com/apache/spark/pull/15771 [SPARK-18260] Make from_json null safe ## What changes were proposed in this pull request? `from_json` is currently not safe against `null` rows. This PR adds a fix and a regression test

[GitHub] spark issue #15702: [SPARK-18124] Observed delay based Event Time Watermarks

2016-11-01 Thread brkyvz
Github user brkyvz commented on the issue: https://github.com/apache/spark/pull/15702 A very dumb question (I apologize): there is nothing stopping a user from actually using processing time as watermarks with this API either. One can easily do `df.withColumn("time

[GitHub] spark issue #15632: [SPARK-18105] fix buffer overflow in LZ4

2016-10-25 Thread brkyvz
Github user brkyvz commented on the issue: https://github.com/apache/spark/pull/15632 @srowen I think it's a matter of how fast upstream publishes a new version, and we can make sure that everything works

[GitHub] spark issue #14553: [SPARK-16963] [STREAMING] [SQL] Changes to Source trait ...

2016-10-24 Thread brkyvz
Github user brkyvz commented on the issue: https://github.com/apache/spark/pull/14553 LGTM as well!

[GitHub] spark pull request #14553: [SPARK-16963] [STREAMING] [SQL] Changes to Source...

2016-10-24 Thread brkyvz
Github user brkyvz commented on a diff in the pull request: https://github.com/apache/spark/pull/14553#discussion_r84811441 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/memory.scala --- @@ -111,6 +126,23 @@ case class MemoryStream[A : Encoder](id

[GitHub] spark pull request #15470: [SPARK-17921] failfast on checkpointLocation spec...

2016-10-13 Thread brkyvz
Github user brkyvz closed the pull request at: https://github.com/apache/spark/pull/15470

[GitHub] spark issue #15235: [SPARK-17661][SQL] Consolidate various listLeafFiles imp...

2016-10-13 Thread brkyvz
Github user brkyvz commented on the issue: https://github.com/apache/spark/pull/15235 @petermaxlee Thanks for making the change. This LGTM now that the fix for `SPARK-17599` is in the right place. The rest is just moving around, consolidating old code.

[GitHub] spark issue #15470: [SPARK-17921] failfast on checkpointLocation specified f...

2016-10-13 Thread brkyvz
Github user brkyvz commented on the issue: https://github.com/apache/spark/pull/15470 cc @zsxwing

[GitHub] spark pull request #15470: [SPARK-17921] failfast on checkpointLocation spec...

2016-10-13 Thread brkyvz
GitHub user brkyvz opened a pull request: https://github.com/apache/spark/pull/15470 [SPARK-17921] failfast on checkpointLocation specified for memory streams ## What changes were proposed in this pull request? The checkpointLocation option in memory streams

[GitHub] spark issue #15437: [SPARK-17876] Write StructuredStreaming WAL to a stream ...

2016-10-12 Thread brkyvz
Github user brkyvz commented on the issue: https://github.com/apache/spark/pull/15437 Thanks @zsxwing addressed your comments

[GitHub] spark pull request #15307: [SPARK-17731][SQL][STREAMING] Metrics for structu...

2016-10-11 Thread brkyvz
Github user brkyvz commented on a diff in the pull request: https://github.com/apache/spark/pull/15307#discussion_r82890911 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamMetrics.scala --- @@ -0,0 +1,244 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request #15307: [SPARK-17731][SQL][STREAMING] Metrics for structu...

2016-10-11 Thread brkyvz
Github user brkyvz commented on a diff in the pull request: https://github.com/apache/spark/pull/15307#discussion_r82890581 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamExecution.scala --- @@ -516,12 +563,127 @@ class StreamExecution

[GitHub] spark pull request #15307: [SPARK-17731][SQL][STREAMING] Metrics for structu...

2016-10-11 Thread brkyvz
Github user brkyvz commented on a diff in the pull request: https://github.com/apache/spark/pull/15307#discussion_r82890465 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamExecution.scala --- @@ -105,11 +105,21 @@ class StreamExecution( var

[GitHub] spark pull request #15307: [SPARK-17731][SQL][STREAMING] Metrics for structu...

2016-10-11 Thread brkyvz
Github user brkyvz commented on a diff in the pull request: https://github.com/apache/spark/pull/15307#discussion_r82890127 --- Diff: python/pyspark/sql/streaming.py --- @@ -189,6 +189,282 @@ def resetTerminated(self): self._jsqm.resetTerminated

[GitHub] spark issue #15437: [SPARK-17876] Write StructuredStreaming WAL to a stream ...

2016-10-11 Thread brkyvz
Github user brkyvz commented on the issue: https://github.com/apache/spark/pull/15437 cc @tdas @zsxwing Would one of you want to look at this?

[GitHub] spark pull request #15307: [SPARK-17731][SQL][STREAMING] Metrics for structu...

2016-10-11 Thread brkyvz
Github user brkyvz commented on a diff in the pull request: https://github.com/apache/spark/pull/15307#discussion_r82872469 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamExecution.scala --- @@ -176,7 +184,9 @@ class StreamExecution

[GitHub] spark pull request #15437: [SPARK-17876] Write StructuredStreaming WAL to a ...

2016-10-11 Thread brkyvz
GitHub user brkyvz opened a pull request: https://github.com/apache/spark/pull/15437 [SPARK-17876] Write StructuredStreaming WAL to a stream instead of materializing all at once ## What changes were proposed in this pull request? The CompactibleFileStreamLog materializes

[GitHub] spark pull request #15307: [SPARK-17731][SQL][STREAMING] Metrics for structu...

2016-10-10 Thread brkyvz
Github user brkyvz commented on a diff in the pull request: https://github.com/apache/spark/pull/15307#discussion_r82709478 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamMetrics.scala --- @@ -0,0 +1,244 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request #15307: [SPARK-17731][SQL][STREAMING] Metrics for structu...

2016-10-10 Thread brkyvz
Github user brkyvz commented on a diff in the pull request: https://github.com/apache/spark/pull/15307#discussion_r82709451 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamMetrics.scala --- @@ -0,0 +1,244 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request #15307: [SPARK-17731][SQL][STREAMING] Metrics for structu...

2016-10-10 Thread brkyvz
Github user brkyvz commented on a diff in the pull request: https://github.com/apache/spark/pull/15307#discussion_r82709228 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamMetrics.scala --- @@ -0,0 +1,244 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request #15307: [SPARK-17731][SQL][STREAMING] Metrics for structu...

2016-10-10 Thread brkyvz
Github user brkyvz commented on a diff in the pull request: https://github.com/apache/spark/pull/15307#discussion_r82708496 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamExecution.scala --- @@ -530,7 +692,7 @@ class StreamExecution( case

[GitHub] spark pull request #15307: [SPARK-17731][SQL][STREAMING] Metrics for structu...

2016-10-10 Thread brkyvz
Github user brkyvz commented on a diff in the pull request: https://github.com/apache/spark/pull/15307#discussion_r82708346 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamExecution.scala --- @@ -221,8 +247,15 @@ class StreamExecution

[GitHub] spark pull request #15307: [SPARK-17731][SQL][STREAMING] Metrics for structu...

2016-10-10 Thread brkyvz
Github user brkyvz commented on a diff in the pull request: https://github.com/apache/spark/pull/15307#discussion_r82708289 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamExecution.scala --- @@ -176,7 +184,9 @@ class StreamExecution

[GitHub] spark pull request #15307: [SPARK-17731][SQL][STREAMING] Metrics for structu...

2016-10-10 Thread brkyvz
Github user brkyvz commented on a diff in the pull request: https://github.com/apache/spark/pull/15307#discussion_r82708317 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamExecution.scala --- @@ -176,7 +184,9 @@ class StreamExecution

[GitHub] spark pull request #15307: [SPARK-17731][SQL][STREAMING] Metrics for structu...

2016-10-10 Thread brkyvz
Github user brkyvz commented on a diff in the pull request: https://github.com/apache/spark/pull/15307#discussion_r82708148 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamExecution.scala --- @@ -176,7 +184,9 @@ class StreamExecution

[GitHub] spark pull request #15307: [SPARK-17731][SQL][STREAMING] Metrics for structu...

2016-10-10 Thread brkyvz
Github user brkyvz commented on a diff in the pull request: https://github.com/apache/spark/pull/15307#discussion_r82707725 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StatefulAggregate.scala --- @@ -56,7 +57,12 @@ case class StateStoreRestoreExec

[GitHub] spark pull request #15307: [SPARK-17731][SQL][STREAMING] Metrics for structu...

2016-10-10 Thread brkyvz
Github user brkyvz commented on a diff in the pull request: https://github.com/apache/spark/pull/15307#discussion_r82707355 --- Diff: python/pyspark/sql/streaming.py --- @@ -189,6 +189,282 @@ def resetTerminated(self): self._jsqm.resetTerminated

[GitHub] spark pull request #15307: [SPARK-17731][SQL][STREAMING] Metrics for structu...

2016-10-10 Thread brkyvz
Github user brkyvz commented on a diff in the pull request: https://github.com/apache/spark/pull/15307#discussion_r82707181 --- Diff: python/pyspark/sql/streaming.py --- @@ -189,6 +189,282 @@ def resetTerminated(self): self._jsqm.resetTerminated

[GitHub] spark pull request #15307: [SPARK-17731][SQL][STREAMING] Metrics for structu...

2016-10-10 Thread brkyvz
Github user brkyvz commented on a diff in the pull request: https://github.com/apache/spark/pull/15307#discussion_r82707120 --- Diff: python/pyspark/sql/streaming.py --- @@ -189,6 +189,282 @@ def resetTerminated(self): self._jsqm.resetTerminated

[GitHub] spark pull request #15307: [SPARK-17731][SQL][STREAMING] Metrics for structu...

2016-10-10 Thread brkyvz
Github user brkyvz commented on a diff in the pull request: https://github.com/apache/spark/pull/15307#discussion_r82706974 --- Diff: python/pyspark/sql/streaming.py --- @@ -189,6 +189,282 @@ def resetTerminated(self): self._jsqm.resetTerminated

[GitHub] spark pull request #15307: [SPARK-17731][SQL][STREAMING] Metrics for structu...

2016-10-10 Thread brkyvz
Github user brkyvz commented on a diff in the pull request: https://github.com/apache/spark/pull/15307#discussion_r82706923 --- Diff: python/pyspark/sql/streaming.py --- @@ -189,6 +189,282 @@ def resetTerminated(self): self._jsqm.resetTerminated

[GitHub] spark pull request #15307: [SPARK-17731][SQL][STREAMING] Metrics for structu...

2016-10-10 Thread brkyvz
Github user brkyvz commented on a diff in the pull request: https://github.com/apache/spark/pull/15307#discussion_r82706830 --- Diff: python/pyspark/sql/streaming.py --- @@ -189,6 +189,282 @@ def resetTerminated(self): self._jsqm.resetTerminated

[GitHub] spark pull request #15307: [SPARK-17731][SQL][STREAMING] Metrics for structu...

2016-10-10 Thread brkyvz
Github user brkyvz commented on a diff in the pull request: https://github.com/apache/spark/pull/15307#discussion_r82706767 --- Diff: python/pyspark/sql/streaming.py --- @@ -189,6 +189,282 @@ def resetTerminated(self): self._jsqm.resetTerminated

[GitHub] spark pull request #15307: [SPARK-17731][SQL][STREAMING] Metrics for structu...

2016-10-10 Thread brkyvz
Github user brkyvz commented on a diff in the pull request: https://github.com/apache/spark/pull/15307#discussion_r82706656 --- Diff: python/pyspark/sql/streaming.py --- @@ -189,6 +189,282 @@ def resetTerminated(self): self._jsqm.resetTerminated

[GitHub] spark pull request #15380: Backport [SPARK-15062][SQL] fix list type infer s...

2016-10-06 Thread brkyvz
Github user brkyvz closed the pull request at: https://github.com/apache/spark/pull/15380

[GitHub] spark issue #15380: Backport [SPARK-15062][SQL] fix list type infer serializ...

2016-10-06 Thread brkyvz
Github user brkyvz commented on the issue: https://github.com/apache/spark/pull/15380 cc @marmbrus

[GitHub] spark pull request #15380: Backport [SPARK-15062][SQL] fix list type infer s...

2016-10-06 Thread brkyvz
GitHub user brkyvz opened a pull request: https://github.com/apache/spark/pull/15380 Backport [SPARK-15062][SQL] fix list type infer serializer issue ## What changes were proposed in this pull request? This backports https://github.com/apache/spark/commit

[GitHub] spark pull request #15235: [SPARK-17661][SQL] Consolidate various listLeafFi...

2016-09-26 Thread brkyvz
Github user brkyvz commented on a diff in the pull request: https://github.com/apache/spark/pull/15235#discussion_r80582937 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/ListingFileCatalog.scala --- @@ -82,73 +85,185 @@ class ListingFileCatalog

[GitHub] spark pull request #15235: [SPARK-17661][SQL] Consolidate various listLeafFi...

2016-09-26 Thread brkyvz
Github user brkyvz commented on a diff in the pull request: https://github.com/apache/spark/pull/15235#discussion_r80583033 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/ListingFileCatalogSuite.scala --- @@ -0,0 +1,34 @@ +/* + * Licensed

[GitHub] spark pull request #15235: [SPARK-17661][SQL] Consolidate various listLeafFi...

2016-09-26 Thread brkyvz
Github user brkyvz commented on a diff in the pull request: https://github.com/apache/spark/pull/15235#discussion_r80582702 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/ListingFileCatalog.scala --- @@ -82,73 +85,185 @@ class ListingFileCatalog

[GitHub] spark pull request #15235: [SPARK-17661][SQL] Consolidate various listLeafFi...

2016-09-26 Thread brkyvz
Github user brkyvz commented on a diff in the pull request: https://github.com/apache/spark/pull/15235#discussion_r80583392 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/ListingFileCatalog.scala --- @@ -82,73 +83,177 @@ class ListingFileCatalog

[GitHub] spark pull request #15235: [SPARK-17661][SQL] Consolidate various listLeafFi...

2016-09-26 Thread brkyvz
Github user brkyvz commented on a diff in the pull request: https://github.com/apache/spark/pull/15235#discussion_r80573942 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/ListingFileCatalog.scala --- @@ -82,73 +85,185 @@ class ListingFileCatalog

[GitHub] spark pull request #15235: [SPARK-17661][SQL] Consolidate various listLeafFi...

2016-09-26 Thread brkyvz
Github user brkyvz commented on a diff in the pull request: https://github.com/apache/spark/pull/15235#discussion_r80583326 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/ListingFileCatalog.scala --- @@ -82,73 +85,185 @@ class ListingFileCatalog

[GitHub] spark pull request #15235: [SPARK-17661][SQL] Consolidate various listLeafFi...

2016-09-26 Thread brkyvz
Github user brkyvz commented on a diff in the pull request: https://github.com/apache/spark/pull/15235#discussion_r80574449 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/ListingFileCatalog.scala --- @@ -82,73 +85,185 @@ class ListingFileCatalog
