[GitHub] spark pull request: [SPARK-6538][SQL] Add missing nullable Metasto...

2015-03-25 Thread budde
GitHub user budde opened a pull request: https://github.com/apache/spark/pull/5188 [SPARK-6538][SQL] Add missing nullable Metastore fields when merging a Parquet schema When Spark SQL infers a schema for a DataFrame, it will take the union of all field types present
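The PR title describes adding nullable Metastore fields that are missing from a merged Parquet schema. The following is a minimal plain-Python sketch of that idea, not Spark's implementation: the `Field` type, the `merge_missing_nullable_fields` helper, and the case-insensitive name comparison are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Field:
    """Hypothetical stand-in for a schema field (not a Spark class)."""
    name: str
    dtype: str
    nullable: bool = True


def merge_missing_nullable_fields(metastore_schema, parquet_schema):
    """Return the Parquet fields plus any nullable metastore fields they lack."""
    # Assumption of this sketch: names are compared case-insensitively.
    present = {f.name.lower() for f in parquet_schema}
    missing = [
        f for f in metastore_schema
        if f.nullable and f.name.lower() not in present
    ]
    return list(parquet_schema) + missing


metastore = [
    Field("id", "int", nullable=False),
    Field("name", "string"),
    Field("email", "string"),  # nullable, absent from the Parquet files
]
parquet = [Field("id", "int", nullable=False), Field("name", "string")]

merged = merge_missing_nullable_fields(metastore, parquet)
print([f.name for f in merged])
```

In this sketch a missing non-nullable field is deliberately left out, since a reader could not fill it with nulls.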

[GitHub] spark pull request: [SPARK-6538][SQL] Add missing nullable Metasto...

2015-03-26 Thread budde
GitHub user budde opened a pull request: https://github.com/apache/spark/pull/5214 [SPARK-6538][SQL] Add missing nullable Metastore fields when merging a Parquet schema Opening to replace #5188. When Spark SQL infers a schema for a DataFrame, it will take the union of all

[GitHub] spark pull request: [SPARK-6538][SQL] Add missing nullable Metasto...

2015-03-26 Thread budde
Github user budde closed the pull request at: https://github.com/apache/spark/pull/5188 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature

[GitHub] spark pull request: [SPARK-6538][SQL] Add missing nullable Metasto...

2015-03-26 Thread budde
Github user budde commented on the pull request: https://github.com/apache/spark/pull/5188#issuecomment-86625383 Thanks for the input, @marmbrus and @liancheng. I'll resolve the conflicts and open a new PR against master.

[GitHub] spark pull request: [SPARK-6538][SQL] Add missing nullable Metasto...

2015-03-27 Thread budde
Github user budde commented on a diff in the pull request: https://github.com/apache/spark/pull/5214#discussion_r27315420 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/parquet/newParquet.scala --- @@ -775,6 +777,32 @@ private[sql] object ParquetRelation2 extends Logging

[GitHub] spark pull request: [SPARK-6538][SQL] Add missing nullable Metasto...

2015-03-27 Thread budde
Github user budde commented on a diff in the pull request: https://github.com/apache/spark/pull/5214#discussion_r27311560 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/parquet/newParquet.scala --- @@ -775,6 +777,32 @@ private[sql] object ParquetRelation2 extends Logging

[GitHub] spark pull request: [SPARK-6538][SQL] Add missing nullable Metasto...

2015-03-27 Thread budde
Github user budde commented on a diff in the pull request: https://github.com/apache/spark/pull/5214#discussion_r27332712 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/parquet/newParquet.scala --- @@ -775,6 +777,32 @@ private[sql] object ParquetRelation2 extends Logging

[GitHub] spark pull request: [SPARK-6538][SQL] Add missing nullable Metasto...

2015-03-27 Thread budde
Github user budde commented on a diff in the pull request: https://github.com/apache/spark/pull/5214#discussion_r27332969 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/parquet/newParquet.scala --- @@ -775,6 +777,32 @@ private[sql] object ParquetRelation2 extends Logging

[GitHub] spark pull request: [SPARK-6538][SQL] Add missing nullable Metasto...

2015-03-26 Thread budde
Github user budde commented on the pull request: https://github.com/apache/spark/pull/5214#issuecomment-86699105 Taking a look at why these tests failed.

[GitHub] spark pull request: [SPARK-6538][SQL] Add missing nullable Metasto...

2015-03-26 Thread budde
Github user budde commented on the pull request: https://github.com/apache/spark/pull/5214#issuecomment-86721204 I must've accidentally run the tests on an old build artifact before opening this PR. It turns out that tests included in #5141 expect failure in scenarios now permitted

[GitHub] spark pull request: [SPARK-13122] Fix race condition in MemoryStor...

2016-02-01 Thread budde
Github user budde commented on the pull request: https://github.com/apache/spark/pull/11012#issuecomment-178314141 Pinging @andrewor14, the original implementor of unrollSafely(), for any potential feedback.

[GitHub] spark pull request: [SPARK-13122] Fix race condition in MemoryStor...

2016-02-01 Thread budde
GitHub user budde opened a pull request: https://github.com/apache/spark/pull/11012 [SPARK-13122] Fix race condition in MemoryStore.unrollSafely() https://issues.apache.org/jira/browse/SPARK-13122 A race condition can occur in MemoryStore's unrollSafely() method if two

[GitHub] spark pull request: [SPARK-13122] Fix race condition in MemoryStor...

2016-02-02 Thread budde
Github user budde commented on the pull request: https://github.com/apache/spark/pull/11012#issuecomment-178853877 From Jenkins output: >Fetching upstream changes from https://github.com/apache/spark.git > git --version # timeout=10 > git fetch --tags -

[GitHub] spark pull request: [SPARK-13122] Fix race condition in MemoryStor...

2016-02-02 Thread budde
Github user budde commented on a diff in the pull request: https://github.com/apache/spark/pull/11012#discussion_r51645315 --- Diff: core/src/main/scala/org/apache/spark/storage/MemoryStore.scala --- @@ -304,10 +309,9 @@ private[spark] class MemoryStore(blockManager: BlockManager

[GitHub] spark pull request: [SPARK-13122] Fix race condition in MemoryStor...

2016-02-02 Thread budde
Github user budde commented on a diff in the pull request: https://github.com/apache/spark/pull/11012#discussion_r51643796 --- Diff: core/src/main/scala/org/apache/spark/storage/MemoryStore.scala --- @@ -304,10 +309,9 @@ private[spark] class MemoryStore(blockManager: BlockManager

[GitHub] spark pull request: [SPARK-13122] Fix race condition in MemoryStor...

2016-02-02 Thread budde
Github user budde commented on the pull request: https://github.com/apache/spark/pull/11012#issuecomment-178940544 Looks like a bunch of Spark SQL/Hive tests are failing due to this error: >Caused by: sbt.ForkMain$ForkError: org.apache.spark.SparkException: Job aborted

[GitHub] spark pull request: [SPARK-13122] Fix race condition in MemoryStor...

2016-02-02 Thread budde
Github user budde commented on the pull request: https://github.com/apache/spark/pull/11012#issuecomment-178929153 Latest change is looking good on my end. No unroll memory is being leaked.

[GitHub] spark pull request: [SPARK-13122] Fix race condition in MemoryStor...

2016-02-02 Thread budde
Github user budde commented on the pull request: https://github.com/apache/spark/pull/11012#issuecomment-178766741 Updated PR with new implementation that uses a counter variable instead of requiring the whole method to be atomic.
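The comment above contrasts making a whole method atomic with tracking state in a counter variable. The following is a rough Python sketch of the counter approach, assuming the race comes from concurrent callers sharing one bookkeeping entry. `UnrollAccounting`, `unroll`, and the chunk sizes are hypothetical names for illustration and are not taken from `MemoryStore`.

```python
import threading


class UnrollAccounting:
    """Hypothetical shared bookkeeping of reserved bytes per task id."""

    def __init__(self):
        self._lock = threading.Lock()
        self._reserved = {}  # task id -> bytes currently reserved

    def reserve(self, task_id, amount):
        with self._lock:
            self._reserved[task_id] = self._reserved.get(task_id, 0) + amount

    def release(self, task_id, amount):
        with self._lock:
            remaining = self._reserved.get(task_id, 0) - amount
            if remaining < 0:
                raise ValueError("released more than was reserved")
            self._reserved[task_id] = remaining

    def reserved(self, task_id):
        with self._lock:
            return self._reserved.get(task_id, 0)


def unroll(accounting, task_id, chunk_sizes):
    """Reserve memory chunk by chunk, then release only what this call took."""
    pending = 0  # local counter: bytes reserved by this call alone
    try:
        for size in chunk_sizes:
            accounting.reserve(task_id, size)
            pending += size
        return pending
    finally:
        # Releasing the local counter never touches another caller's share,
        # so only the individual map updates need to hold the lock.
        accounting.release(task_id, pending)


accounting = UnrollAccounting()
threads = [
    threading.Thread(target=unroll, args=(accounting, "task-1", [64, 128, 256]))
    for _ in range(2)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(accounting.reserved("task-1"))
```

Because each call releases exactly the amount it counted locally, two callers with the same task id cannot leak or double-release memory regardless of how their steps interleave. Computing the amount as a before/after difference of the shared total would not have that property.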

[GitHub] spark issue #16744: [SPARK-19405][STREAMING] Support for cross-account Kines...

2017-02-01 Thread budde
Github user budde commented on the issue: https://github.com/apache/spark/pull/16744 Pinging @brkyvz as well, who also appears to have reviewed kinesis-asl changes in the past

[GitHub] spark pull request #16744: [SPARK-19405][STREAMING] Support for cross-accoun...

2017-02-02 Thread budde
Github user budde commented on a diff in the pull request: https://github.com/apache/spark/pull/16744#discussion_r99217534 --- Diff: pom.xml --- @@ -146,6 +146,8 @@ hadoop2 0.7.1 1.6.2 + +1.10.61 --- End diff -- I believe

[GitHub] spark issue #16797: [SPARK-19455][SQL] Add option for case-insensitive Parqu...

2017-02-03 Thread budde
Github user budde commented on the issue: https://github.com/apache/spark/pull/16797 Relevant part of [Jenkins output](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72326/console) for SparkR tests: ``` Error: processing vignette 'sparkr-vignettes.Rmd

[GitHub] spark issue #16797: [SPARK-19455][SQL] Add option for case-insensitive Parqu...

2017-02-03 Thread budde
Github user budde commented on the issue: https://github.com/apache/spark/pull/16797 Pinging @ericl, @cloud-fan and @davies, committers who have all reviewed or submitted changes related to this.

[GitHub] spark issue #16797: [SPARK-19455][SQL] Add option for case-insensitive Parqu...

2017-02-03 Thread budde
Github user budde commented on the issue: https://github.com/apache/spark/pull/16797 Looks like SparkR unit tests have been failing for all or most PRs after [this commit.](https://github.com/apache/spark/commit/48aafeda7db879491ed36fff89d59ca7ec3136fa)

[GitHub] spark pull request #16797: [SPARK-19455][SQL] Add option for case-insensitiv...

2017-02-03 Thread budde
GitHub user budde opened a pull request: https://github.com/apache/spark/pull/16797 [SPARK-19455][SQL] Add option for case-insensitive Parquet field resolution ## What changes were proposed in this pull request? **Summary** - Add

[GitHub] spark pull request #16797: [SPARK-19455][SQL] Add option for case-insensitiv...

2017-02-03 Thread budde
Github user budde commented on a diff in the pull request: https://github.com/apache/spark/pull/16797#discussion_r99456138 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetReadSupport.scala --- @@ -268,13 +292,23 @@ private[parquet

[GitHub] spark issue #16744: [SPARK-19405][STREAMING] Support for cross-account Kines...

2017-01-31 Thread budde
Github user budde commented on the issue: https://github.com/apache/spark/pull/16744 Pinging @tdas on this-- looks like you're the committer who has contributed the most to kinesis-asl.

[GitHub] spark pull request #16744: [SPARK-19405][STREAMING] Support for cross-accoun...

2017-02-06 Thread budde
Github user budde commented on a diff in the pull request: https://github.com/apache/spark/pull/16744#discussion_r99718950 --- Diff: external/kinesis-asl/src/main/scala/org/apache/spark/streaming/kinesis/KinesisReceiver.scala --- @@ -35,10 +36,65 @@ import

[GitHub] spark issue #16744: [SPARK-19405][STREAMING] Support for cross-account Kines...

2017-02-06 Thread budde
Github user budde commented on the issue: https://github.com/apache/spark/pull/16744 PR has been amended to reflect feedback. Thanks for taking a look, @brkyvz.

[GitHub] spark issue #16797: [SPARK-19455][SQL] Add option for case-insensitive Parqu...

2017-02-06 Thread budde
Github user budde commented on the issue: https://github.com/apache/spark/pull/16797 > Should we roll these behaviors into one flag? e.g. ```spark.sql.hive.mixedCaseSchemaSupport``` That sounds reasonable to me. The only thing I wonder about is if there's any use case wh

[GitHub] spark pull request #16744: [SPARK-19405][STREAMING] Support for cross-accoun...

2017-02-06 Thread budde
Github user budde commented on a diff in the pull request: https://github.com/apache/spark/pull/16744#discussion_r99718545 --- Diff: external/kinesis-asl/src/main/scala/org/apache/spark/examples/streaming/KinesisExampleUtils.scala --- @@ -0,0 +1,22 @@ +/* + * Licensed

[GitHub] spark issue #16797: [SPARK-19455][SQL] Add option for case-insensitive Parqu...

2017-02-06 Thread budde
Github user budde commented on the issue: https://github.com/apache/spark/pull/16797 > how about we add a new SQL command to refresh the table schema in metastore by inferring schema with data files? This is a compatibility issue and we should have provided a way for users to migr

[GitHub] spark issue #16797: [SPARK-19455][SQL] Add option for case-insensitive Parqu...

2017-02-06 Thread budde
Github user budde commented on the issue: https://github.com/apache/spark/pull/16797 > BTW, what behavior do we expect if a parquet file has two columns whose lower-cased names are identical? I can take a look at how Spark handled this prior to 2.1, although I'm not s
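The quoted question concerns a Parquet file with two columns whose lower-cased names are identical. The following is a minimal sketch of case-insensitive field resolution that reports such a collision instead of silently picking one column. The `resolve_field` helper and the policy of raising on ambiguity are assumptions for illustration, not a description of what Spark does.

```python
def resolve_field(parquet_names, requested):
    """Find the Parquet column matching `requested`, ignoring case.

    Returns the column name as written in the file, or None if absent.
    Raises ValueError when more than one column matches (hypothetical policy).
    """
    wanted = requested.lower()
    matches = [name for name in parquet_names if name.lower() == wanted]
    if not matches:
        return None
    if len(matches) > 1:
        raise ValueError(
            f"ambiguous field {requested!r}: matches {matches}"
        )
    return matches[0]


# A lower-case metastore name resolves to the mixed-case Parquet column.
print(resolve_field(["userId", "Name"], "userid"))
```

Other policies are possible, such as preferring an exact-case match before falling back to the case-insensitive lookup.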

[GitHub] spark issue #16797: [SPARK-19455][SQL] Add option for case-insensitive Parqu...

2017-02-04 Thread budde
Github user budde commented on the issue: https://github.com/apache/spark/pull/16797 Bringing back schema inference is certainly a much cleaner option, although I imagine doing this in the old manner would negate the performance improvements brought by #14690 for any non-Spark 2.1

[GitHub] spark issue #16797: [SPARK-19455][SQL] Add option for case-insensitive Parqu...

2017-02-07 Thread budde
Github user budde commented on the issue: https://github.com/apache/spark/pull/16797 > Can we write such schema (conflicting columns after lower-casing) into metastore? I think the scenario here would be that the metastore contains a single lower-case column name that co

[GitHub] spark pull request #16744: [SPARK-19405][STREAMING] Support for cross-accoun...

2017-02-07 Thread budde
Github user budde commented on a diff in the pull request: https://github.com/apache/spark/pull/16744#discussion_r99908125 --- Diff: external/kinesis-asl/src/main/scala/org/apache/spark/streaming/kinesis/KinesisUtils.scala --- @@ -449,22 +935,48 @@ private class

[GitHub] spark pull request #16744: [SPARK-19405][STREAMING] Support for cross-accoun...

2017-02-07 Thread budde
Github user budde commented on a diff in the pull request: https://github.com/apache/spark/pull/16744#discussion_r99908239 --- Diff: external/kinesis-asl/src/main/scala/org/apache/spark/streaming/kinesis/KinesisReceiver.scala --- @@ -34,11 +35,56 @@ import

[GitHub] spark issue #16744: [SPARK-19405][STREAMING] Support for cross-account Kines...

2017-02-07 Thread budde
Github user budde commented on the issue: https://github.com/apache/spark/pull/16744 Amending PR per review feedback. Issue around using optional stsExternalId argument in ```KinesisUtils.createStream()``` remains open.

[GitHub] spark pull request #16744: [SPARK-19405][STREAMING] Support for cross-accoun...

2017-02-07 Thread budde
Github user budde commented on a diff in the pull request: https://github.com/apache/spark/pull/16744#discussion_r99907831 --- Diff: external/kinesis-asl/src/main/scala/org/apache/spark/streaming/kinesis/KinesisUtils.scala --- @@ -449,22 +935,48 @@ private class

[GitHub] spark pull request #16744: [SPARK-19405][STREAMING] Support for cross-accoun...

2017-02-07 Thread budde
Github user budde commented on a diff in the pull request: https://github.com/apache/spark/pull/16744#discussion_r99906733 --- Diff: external/kinesis-asl/src/main/scala/org/apache/spark/streaming/kinesis/KinesisUtils.scala --- @@ -123,9 +123,143 @@ object KinesisUtils

[GitHub] spark pull request #16744: [SPARK-19405][STREAMING] Support for cross-accoun...

2017-02-07 Thread budde
Github user budde commented on a diff in the pull request: https://github.com/apache/spark/pull/16744#discussion_r99909144 --- Diff: external/kinesis-asl/src/main/scala/org/apache/spark/streaming/kinesis/KinesisUtils.scala --- @@ -123,9 +123,143 @@ object KinesisUtils

[GitHub] spark issue #16744: [SPARK-19405][STREAMING] Support for cross-account Kines...

2017-02-08 Thread budde
Github user budde commented on the issue: https://github.com/apache/spark/pull/16744 Amending the PR again to fix new dependency conflict in spark/pom.xml. Thanks again for taking the time to review this, @brkyvz and @srowen. Please let me know if you feel any additional changes

[GitHub] spark issue #16744: [SPARK-19405][STREAMING] Support for cross-account Kines...

2017-02-08 Thread budde
Github user budde commented on the issue: https://github.com/apache/spark/pull/16744 Looks like Jenkins is failing to build any recent PR due to the following error: ```[error] Could not find hadoop2.3 in the list. Valid options are ['hadoop2.6', 'hadoop2.7']``` I

[GitHub] spark issue #16797: [SPARK-19455][SQL] Add option for case-insensitive Parqu...

2017-02-08 Thread budde
Github user budde commented on the issue: https://github.com/apache/spark/pull/16797 > For better user experience, we should automatically infer the schema and write it back to metastore, if there is no case-sensitive table schema in metastore. This has the cost of detection the n

[GitHub] spark issue #16744: [SPARK-19405][STREAMING] Support for cross-account Kines...

2017-02-06 Thread budde
Github user budde commented on the issue: https://github.com/apache/spark/pull/16744 Amending this PR to upgrade the KCL/AWS SDK dependencies to more-current versions (1.7.3 and 1.11.76, respectively). The ```RegionUtils.getRegionByEndpoint()``` API was removed from the SDK, so I've

[GitHub] spark issue #16797: [SPARK-19455][SQL] Add option for case-insensitive Parqu...

2017-02-06 Thread budde
Github user budde commented on the issue: https://github.com/apache/spark/pull/16797 I'll double check, but I don't think ```spark.sql.hive.manageFilesourcePartitions=false``` would solve this issue since we're still deriving the file relation's dataSchema parameter from the schema

[GitHub] spark issue #16797: [SPARK-19455][SQL] Add option for case-insensitive Parqu...

2017-02-07 Thread budde
Github user budde commented on the issue: https://github.com/apache/spark/pull/16797 > is it a completely compatibility issue? Seems like the only problem is, when we write out mixed-case-schema parquet files directly, and create an external table pointing to these files with Sp

[GitHub] spark pull request #16744: [SPARK-19405][STREAMING] Support for cross-accoun...

2017-02-07 Thread budde
Github user budde commented on a diff in the pull request: https://github.com/apache/spark/pull/16744#discussion_r99905577 --- Diff: external/kinesis-asl/src/test/scala/org/apache/spark/streaming/kinesis/KinesisReceiverSuite.scala --- @@ -62,9 +62,20 @@ class KinesisReceiverSuite

[GitHub] spark pull request #16744: [SPARK-19405][STREAMING] Support for cross-accoun...

2017-02-07 Thread budde
Github user budde commented on a diff in the pull request: https://github.com/apache/spark/pull/16744#discussion_r99905664 --- Diff: external/kinesis-asl/src/main/scala/org/apache/spark/streaming/kinesis/KinesisUtils.scala --- @@ -123,9 +123,143 @@ object KinesisUtils

[GitHub] spark pull request #16744: [SPARK-19405][STREAMING] Support for cross-accoun...

2017-02-07 Thread budde
Github user budde commented on a diff in the pull request: https://github.com/apache/spark/pull/16744#discussion_r99905600 --- Diff: external/kinesis-asl/src/main/scala/org/apache/spark/streaming/kinesis/KinesisReceiver.scala --- @@ -23,7 +23,8 @@ import

[GitHub] spark pull request #16744: [SPARK-19405][STREAMING] Support for cross-accoun...

2017-02-07 Thread budde
Github user budde commented on a diff in the pull request: https://github.com/apache/spark/pull/16744#discussion_r99905835 --- Diff: external/kinesis-asl/src/main/scala/org/apache/spark/streaming/kinesis/KinesisReceiver.scala --- @@ -34,11 +35,56 @@ import

[GitHub] spark pull request #16797: [SPARK-19455][SQL] Add option for case-insensitiv...

2017-02-03 Thread budde
Github user budde commented on a diff in the pull request: https://github.com/apache/spark/pull/16797#discussion_r99455967 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala --- @@ -249,10 +249,18 @@ object SQLConf { val

[GitHub] spark pull request #16797: [SPARK-19455][SQL] Add option for case-insensitiv...

2017-02-03 Thread budde
Github user budde commented on a diff in the pull request: https://github.com/apache/spark/pull/16797#discussion_r99458106 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetReadSupport.scala --- @@ -268,13 +292,23 @@ private[parquet

[GitHub] spark issue #16744: [SPARK-19405][STREAMING] Support for cross-account Kines...

2017-02-01 Thread budde
Github user budde commented on the issue: https://github.com/apache/spark/pull/16744 Pinging @zsxwing and @srowen, additional committers who have previously reviewed kinesis-asl changes.

[GitHub] spark issue #16744: [SPARK-19405][STREAMING] Support for cross-account Kines...

2017-02-01 Thread budde
Github user budde commented on the issue: https://github.com/apache/spark/pull/16744 There shouldn't be any change to behavior or compatibility when using the existing implementations of ```KinesisUtils.createStream()```. Only drawback I can think of is this is making

[GitHub] spark issue #16744: [SPARK-19405][STREAMING] Support for cross-account Kines...

2017-01-30 Thread budde
Github user budde commented on the issue: https://github.com/apache/spark/pull/16744 The JIRA I opened for this issue contains further details and background. Linking to it here for good measure: * https://issues.apache.org/jira/browse/SPARK-19405

[GitHub] spark issue #16744: [SPARK-19405][STREAMING] Support for cross-account Kines...

2017-01-30 Thread budde
Github user budde commented on the issue: https://github.com/apache/spark/pull/16744 Missed the code in python/streaming that this touches. Will update PR.

[GitHub] spark pull request #16744: [SPARK-19405][STREAMING] Support for cross-accoun...

2017-01-30 Thread budde
GitHub user budde opened a pull request: https://github.com/apache/spark/pull/16744 [SPARK-19405][STREAMING] Support for cross-account Kinesis reads via STS - Add dependency on aws-java-sdk-sts - Replace SerializableAWSCredentials with new SerializableKCLAuthProvider class

[GitHub] spark issue #16744: [SPARK-19405][STREAMING] Support for cross-account Kines...

2017-01-30 Thread budde
Github user budde commented on the issue: https://github.com/apache/spark/pull/16744 Also, on another note, the ```SerializableKCLAuthProvider``` class that **SparkQA** is identifying as a new public class is actually package private and replaced another package private class

[GitHub] spark pull request #16944: [SPARK-19611][SQL] Introduce configurable table s...

2017-02-15 Thread budde
Github user budde commented on a diff in the pull request: https://github.com/apache/spark/pull/16944#discussion_r101460842 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveMetastoreCatalog.scala --- @@ -161,23 +161,49 @@ private[hive] class HiveMetastoreCatalog

[GitHub] spark pull request #16944: [SPARK-19611][SQL] Introduce configurable table s...

2017-02-15 Thread budde
Github user budde commented on a diff in the pull request: https://github.com/apache/spark/pull/16944#discussion_r101461535 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveMetastoreCatalog.scala --- @@ -161,23 +161,49 @@ private[hive] class HiveMetastoreCatalog

[GitHub] spark pull request #16944: [SPARK-19611][SQL] Introduce configurable table s...

2017-02-15 Thread budde
Github user budde commented on a diff in the pull request: https://github.com/apache/spark/pull/16944#discussion_r101461155 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/HiveSchemaInferenceSuite.scala --- @@ -0,0 +1,162 @@ +/* + * Licensed to the Apache

[GitHub] spark issue #16944: [SPARK-19611][SQL] Introduce configurable table schema i...

2017-02-15 Thread budde
Github user budde commented on the issue: https://github.com/apache/spark/pull/16944 Looks like I missed a Catalyst test. Updating the PR.

[GitHub] spark pull request #16944: [SPARK-19611][SQL] Introduce configurable table s...

2017-02-15 Thread budde
Github user budde commented on a diff in the pull request: https://github.com/apache/spark/pull/16944#discussion_r101461357 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/HiveSchemaInferenceSuite.scala --- @@ -0,0 +1,162 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request #16944: [SPARK-19611][SQL] Introduce configurable table s...

2017-02-15 Thread budde
Github user budde commented on a diff in the pull request: https://github.com/apache/spark/pull/16944#discussion_r101460565 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala --- @@ -296,6 +296,17 @@ object SQLConf { .longConf

[GitHub] spark pull request #16744: [SPARK-19405][STREAMING] Support for cross-accoun...

2017-02-21 Thread budde
Github user budde commented on a diff in the pull request: https://github.com/apache/spark/pull/16744#discussion_r102356422 --- Diff: python/pyspark/streaming/kinesis.py --- @@ -67,6 +68,12 @@ def createStream(ssc, kinesisAppName, streamName, endpointUrl, regionName

[GitHub] spark issue #16744: [SPARK-19405][STREAMING] Support for cross-account Kines...

2017-02-21 Thread budde
Github user budde commented on the issue: https://github.com/apache/spark/pull/16744 Updated the PR. Thanks for the work you've done on this! Hopefully I can have a PR for the builder interface up later this week.

[GitHub] spark pull request #16744: [SPARK-19405][STREAMING] Support for cross-accoun...

2017-02-21 Thread budde
Github user budde commented on a diff in the pull request: https://github.com/apache/spark/pull/16744#discussion_r102366429 --- Diff: external/kinesis-asl/src/main/scala/org/apache/spark/streaming/kinesis/SerializableCredentialsProvider.scala --- @@ -0,0 +1,85

[GitHub] spark issue #16944: [SPARK-19611][SQL] Introduce configurable table schema i...

2017-02-21 Thread budde
Github user budde commented on the issue: https://github.com/apache/spark/pull/16944 Pinging participants from #16797 once more to get any feedback on the new proposal: @gatorsmile, @viirya, @ericl, @mallman and @cloud-fan

[GitHub] spark issue #16744: [SPARK-19405][STREAMING] Support for cross-account Kines...

2017-02-21 Thread budde
Github user budde commented on the issue: https://github.com/apache/spark/pull/16744 So, if these values are ```null``` we'll still be passing them to construct a ```BasicCredentialsProvider``` to pass as ```STSCredentialsProvider.longLivedCredentialsProvider```. I could add a check

[GitHub] spark issue #16744: [SPARK-19405][STREAMING] Support for cross-account Kines...

2017-02-21 Thread budde
Github user budde commented on the issue: https://github.com/apache/spark/pull/16744 @brkyvz I actually think that Scaladoc may be outdated– I double checked the current master branch and it looks like ```KinesisUtils.createStream()``` will still provide Some

[GitHub] spark pull request #16744: [SPARK-19405][STREAMING] Support for cross-accoun...

2017-02-21 Thread budde
Github user budde commented on a diff in the pull request: https://github.com/apache/spark/pull/16744#discussion_r102351716 --- Diff: external/kinesis-asl/src/main/scala/org/apache/spark/streaming/kinesis/SerializableCredentialsProvider.scala --- @@ -0,0 +1,85

[GitHub] spark pull request #16944: [SPARK-19611][SQL] Introduce configurable table s...

2017-02-18 Thread budde
Github user budde commented on a diff in the pull request: https://github.com/apache/spark/pull/16944#discussion_r101908105 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/HiveSchemaInferenceSuite.scala --- @@ -0,0 +1,192 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request #16944: [SPARK-19611][SQL] Introduce configurable table s...

2017-02-18 Thread budde
Github user budde commented on a diff in the pull request: https://github.com/apache/spark/pull/16944#discussion_r101908155 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/HiveSchemaInferenceSuite.scala --- @@ -0,0 +1,192 @@ +/* + * Licensed to the Apache

[GitHub] spark issue #16744: [SPARK-19405][STREAMING] Support for cross-account Kines...

2017-02-20 Thread budde
Github user budde commented on the issue: https://github.com/apache/spark/pull/16744 @brkyvz Just for clarification, can this PR be merged as-is with a separate Jira/PR for adding a builder interface or is the builder interface a prerequisite for merging this?

[GitHub] spark issue #16944: [SPARK-19611][SQL] Introduce configurable table schema i...

2017-02-19 Thread budde
Github user budde commented on the issue: https://github.com/apache/spark/pull/16944 retest this please

[GitHub] spark issue #16744: [SPARK-19405][STREAMING] Support for cross-account Kines...

2017-02-17 Thread budde
Github user budde commented on the issue: https://github.com/apache/spark/pull/16744 @brkyvz Fair enough. Let me know if there's anything I can do to help get this merged. I can also take a look at adding a builder class for Kinesis streams as a separate PR before the code freeze

[GitHub] spark issue #16744: [SPARK-19405][STREAMING] Support for cross-account Kines...

2017-02-16 Thread budde
Github user budde commented on the issue: https://github.com/apache/spark/pull/16744 @brkyvz, @zsxwing – Any update here? Worried that this PR is starting to languish.

[GitHub] spark pull request #16944: [SPARK-19611][SQL] Introduce configurable table s...

2017-02-22 Thread budde
Github user budde commented on a diff in the pull request: https://github.com/apache/spark/pull/16944#discussion_r102548681 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala --- @@ -690,10 +696,10 @@ private[spark] class HiveExternalCatalog(conf

[GitHub] spark pull request #16944: [SPARK-19611][SQL] Introduce configurable table s...

2017-02-23 Thread budde
Github user budde commented on a diff in the pull request: https://github.com/apache/spark/pull/16944#discussion_r102859179 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveMetastoreCatalog.scala --- @@ -161,22 +164,45 @@ private[hive] class HiveMetastoreCatalog

[GitHub] spark pull request #16944: [SPARK-19611][SQL] Introduce configurable table s...

2017-02-23 Thread budde
Github user budde commented on a diff in the pull request: https://github.com/apache/spark/pull/16944#discussion_r102850496 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala --- @@ -296,6 +296,25 @@ object SQLConf { .longConf

[GitHub] spark pull request #16944: [SPARK-19611][SQL] Introduce configurable table s...

2017-02-23 Thread budde
Github user budde commented on a diff in the pull request: https://github.com/apache/spark/pull/16944#discussion_r102850475 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveMetastoreCatalog.scala --- @@ -161,22 +164,45 @@ private[hive] class HiveMetastoreCatalog

[GitHub] spark issue #16944: [SPARK-19611][SQL] Introduce configurable table schema i...

2017-02-23 Thread budde
Github user budde commented on the issue: https://github.com/apache/spark/pull/16944 Updated per feedback from @ericl: - Added comment with additional context to ```HIVE_CASE_SENSITIVE_INFERENCE``` in SQLConf.scala - Removed default value test

[GitHub] spark issue #16944: [SPARK-19611][SQL] Introduce configurable table schema i...

2017-02-24 Thread budde
Github user budde commented on the issue: https://github.com/apache/spark/pull/16944 @ericl: fixed the param doc string and tried to clean up ```createLogicalRelation()``` as you suggested.

[GitHub] spark issue #16944: [SPARK-19611][SQL] Introduce configurable table schema i...

2017-02-24 Thread budde
Github user budde commented on the issue: https://github.com/apache/spark/pull/16944 Thanks, @ericl. Is there anybody else you'd suggest pinging to take a look at this and ultimately get it merged? Re-pinging @viirya to review latest updates addressing his previous feedback

[GitHub] spark pull request #16944: [SPARK-19611][SQL] Introduce configurable table s...

2017-02-24 Thread budde
Github user budde commented on a diff in the pull request: https://github.com/apache/spark/pull/16944#discussion_r103027907 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/HiveSchemaInferenceSuite.scala --- @@ -0,0 +1,192 @@ +/* + * Licensed to the Apache

[GitHub] spark issue #16944: [SPARK-19611][SQL] Introduce configurable table schema i...

2017-02-24 Thread budde
Github user budde commented on the issue: https://github.com/apache/spark/pull/16944 The ```assert()``` statements added to ```setupCaseSensitiveTable()``` in **HiveSchemaInferenceSuite** per earlier feedback got squashed somewhere in the course of updating this PR. I've added them

[GitHub] spark pull request #16944: [SPARK-19611][SQL] Introduce configurable table s...

2017-02-24 Thread budde
Github user budde commented on a diff in the pull request: https://github.com/apache/spark/pull/16944#discussion_r103050080 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveMetastoreCatalog.scala --- @@ -161,22 +164,51 @@ private[hive] class HiveMetastoreCatalog

[GitHub] spark pull request #16944: [SPARK-19611][SQL] Introduce configurable table s...

2017-02-24 Thread budde
Github user budde commented on a diff in the pull request: https://github.com/apache/spark/pull/16944#discussion_r103051801 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveMetastoreCatalog.scala --- @@ -226,6 +258,41 @@ private[hive] class HiveMetastoreCatalog

[GitHub] spark pull request #16944: [SPARK-19611][SQL] Introduce configurable table s...

2017-02-24 Thread budde
Github user budde commented on a diff in the pull request: https://github.com/apache/spark/pull/16944#discussion_r103049838 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala --- @@ -296,6 +296,25 @@ object SQLConf { .longConf

[GitHub] spark pull request #16944: [SPARK-19611][SQL] Introduce configurable table s...

2017-02-24 Thread budde
Github user budde commented on a diff in the pull request: https://github.com/apache/spark/pull/16944#discussion_r103050192 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveMetastoreCatalog.scala --- @@ -226,6 +258,41 @@ private[hive] class HiveMetastoreCatalog

[GitHub] spark pull request #16944: [SPARK-19611][SQL] Introduce configurable table s...

2017-02-24 Thread budde
Github user budde commented on a diff in the pull request: https://github.com/apache/spark/pull/16944#discussion_r103049359 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/internal/SQLConfSuite.scala --- @@ -21,6 +21,7 @@ import org.apache.hadoop.fs.Path import

[GitHub] spark pull request #16944: [SPARK-19611][SQL] Introduce configurable table s...

2017-02-24 Thread budde
Github user budde commented on a diff in the pull request: https://github.com/apache/spark/pull/16944#discussion_r103049381 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala --- @@ -296,6 +296,25 @@ object SQLConf { .longConf

[GitHub] spark pull request #16944: [SPARK-19611][SQL] Introduce configurable table s...

2017-02-24 Thread budde
Github user budde commented on a diff in the pull request: https://github.com/apache/spark/pull/16944#discussion_r103051158 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveMetastoreCatalog.scala --- @@ -226,6 +258,41 @@ private[hive] class HiveMetastoreCatalog

[GitHub] spark pull request #16944: [SPARK-19611][SQL] Introduce configurable table s...

2017-02-24 Thread budde
Github user budde commented on a diff in the pull request: https://github.com/apache/spark/pull/16944#discussion_r103050652 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala --- @@ -510,8 +510,13 @@ private[spark] class HiveExternalCatalog(conf

[GitHub] spark pull request #16944: [SPARK-19611][SQL] Introduce configurable table s...

2017-02-22 Thread budde
Github user budde commented on a diff in the pull request: https://github.com/apache/spark/pull/16944#discussion_r102554012 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveMetastoreCatalog.scala --- @@ -161,22 +164,70 @@ private[hive] class HiveMetastoreCatalog

[GitHub] spark pull request #16944: [SPARK-19611][SQL] Introduce configurable table s...

2017-02-22 Thread budde
Github user budde commented on a diff in the pull request: https://github.com/apache/spark/pull/16944#discussion_r102554021 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveMetastoreCatalog.scala --- @@ -161,22 +164,70 @@ private[hive] class HiveMetastoreCatalog

[GitHub] spark pull request #16944: [SPARK-19611][SQL] Introduce configurable table s...

2017-02-22 Thread budde
Github user budde commented on a diff in the pull request: https://github.com/apache/spark/pull/16944#discussion_r102547965 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala --- @@ -296,6 +296,21 @@ object SQLConf { .longConf

[GitHub] spark pull request #16944: [SPARK-19611][SQL] Introduce configurable table s...

2017-02-22 Thread budde
Github user budde commented on a diff in the pull request: https://github.com/apache/spark/pull/16944#discussion_r102554155 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala --- @@ -296,6 +296,21 @@ object SQLConf { .longConf

[GitHub] spark pull request #16944: [SPARK-19611][SQL] Introduce configurable table s...

2017-02-22 Thread budde
Github user budde commented on a diff in the pull request: https://github.com/apache/spark/pull/16944#discussion_r102553568 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/interface.scala --- @@ -181,7 +186,8 @@ case class CatalogTable

[GitHub] spark pull request #16944: [SPARK-19611][SQL] Introduce configurable table s...

2017-02-22 Thread budde
Github user budde commented on a diff in the pull request: https://github.com/apache/spark/pull/16944#discussion_r102554381 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveMetastoreCatalog.scala --- @@ -161,22 +164,70 @@ private[hive] class HiveMetastoreCatalog
