[GitHub] spark pull request: [SPARK-4176][WIP] Support decimal types with p...
Github user rtreffer commented on the pull request: https://github.com/apache/spark/pull/6796#issuecomment-122529489

@liancheng sure, I just wasn't sure if it should be closed :-)

---
If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA.
---
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
Github user rtreffer closed the pull request at: https://github.com/apache/spark/pull/6796
Github user liancheng commented on the pull request: https://github.com/apache/spark/pull/6796#issuecomment-122529136

@rtreffer Since #7455 supersedes this PR, would you mind closing this one?
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/6796#issuecomment-120858057

[Test build #1055 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/1055/console) for PR 6796 at commit [`3e30bdf`](https://github.com/apache/spark/commit/3e30bdfb1199a105a882dde7d2dc0bd8edea05a2).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.

Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/6796#issuecomment-121009125

[Test build #37143 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/37143/console) for PR 6796 at commit [`1703c26`](https://github.com/apache/spark/commit/1703c26f917c5e06f60bc9c8cd9299c9ffbb2389).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.

Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6796#issuecomment-121009163

Merged build finished. Test FAILed.
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/6796#issuecomment-121017435

[Test build #37147 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/37147/consoleFull) for PR 6796 at commit [`1dad677`](https://github.com/apache/spark/commit/1dad677449445c878d7d938192df4a6b2d997db4).

Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6796#issuecomment-121024334

Merged build finished. Test FAILed.

Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6796#issuecomment-121016709

Merged build triggered.
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/6796#issuecomment-121024313

[Test build #37147 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/37147/console) for PR 6796 at commit [`1dad677`](https://github.com/apache/spark/commit/1dad677449445c878d7d938192df4a6b2d997db4).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.

Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6796#issuecomment-121016736

Merged build started.

Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/6796#issuecomment-121021867

[Test build #37146 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/37146/console) for PR 6796 at commit [`83ca029`](https://github.com/apache/spark/commit/83ca029b2ec6e940f73acf9da0eae34319baeb6b).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6796#issuecomment-121021895

Merged build finished. Test FAILed.

Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6796#issuecomment-121013563

Merged build triggered.

Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6796#issuecomment-121013639

Merged build started.

Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/6796#issuecomment-121015081

[Test build #37146 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/37146/consoleFull) for PR 6796 at commit [`83ca029`](https://github.com/apache/spark/commit/83ca029b2ec6e940f73acf9da0eae34319baeb6b).
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6796#issuecomment-121001152

Merged build triggered.

Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6796#issuecomment-121001183

Merged build started.

Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/6796#issuecomment-121002347

[Test build #37143 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/37143/consoleFull) for PR 6796 at commit [`1703c26`](https://github.com/apache/spark/commit/1703c26f917c5e06f60bc9c8cd9299c9ffbb2389).

Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/6796#issuecomment-120830562

[Test build #1055 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/1055/consoleFull) for PR 6796 at commit [`3e30bdf`](https://github.com/apache/spark/commit/3e30bdfb1199a105a882dde7d2dc0bd8edea05a2).
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6796#issuecomment-120840863

Merged build started.

Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6796#issuecomment-120840849

Merged build triggered.

Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/6796#issuecomment-120841470

[Test build #37130 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/37130/consoleFull) for PR 6796 at commit [`c8d4d6c`](https://github.com/apache/spark/commit/c8d4d6c9f0e420b2bd54e358b6b73f198ef3373e).
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/6796#issuecomment-120845430

[Test build #37130 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/37130/console) for PR 6796 at commit [`c8d4d6c`](https://github.com/apache/spark/commit/c8d4d6c9f0e420b2bd54e358b6b73f198ef3373e).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.

Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6796#issuecomment-120845445

Merged build finished. Test FAILed.

Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/6796#issuecomment-120751119

[Test build #37099 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/37099/console) for PR 6796 at commit [`3e30bdf`](https://github.com/apache/spark/commit/3e30bdfb1199a105a882dde7d2dc0bd8edea05a2).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6796#issuecomment-120751126

Merged build finished. Test FAILed.

Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6796#issuecomment-120740815

Merged build started.

Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6796#issuecomment-120740808

Merged build triggered.

Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6796#issuecomment-120741423

Merged build finished. Test FAILed.
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/6796#issuecomment-120741421

[Test build #37097 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/37097/console) for PR 6796 at commit [`1152721`](https://github.com/apache/spark/commit/1152721ebeafe7a4535e202c3091f415a8ba3863).
* This patch **fails Scala style tests**.
* This patch merges cleanly.
* This patch adds no public classes.

Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/6796#issuecomment-120749766

[Test build #37099 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/37099/consoleFull) for PR 6796 at commit [`3e30bdf`](https://github.com/apache/spark/commit/3e30bdfb1199a105a882dde7d2dc0bd8edea05a2).

Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/6796#issuecomment-120741354

[Test build #37097 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/37097/consoleFull) for PR 6796 at commit [`1152721`](https://github.com/apache/spark/commit/1152721ebeafe7a4535e202c3091f415a8ba3863).
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6796#issuecomment-120749709

Merged build triggered.

Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6796#issuecomment-120749716

Merged build started.

Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/6796#issuecomment-119297130

[Test build #36702 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/36702/consoleFull) for PR 6796 at commit [`e6dad45`](https://github.com/apache/spark/commit/e6dad4574f47a7b6694500df2a1b86037c86).
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6796#issuecomment-119298855

Merged build triggered.

Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6796#issuecomment-119298914

Merged build started.

Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/6796#issuecomment-119297921

[Test build #36702 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/36702/console) for PR 6796 at commit [`e6dad45`](https://github.com/apache/spark/commit/e6dad4574f47a7b6694500df2a1b86037c86).
* This patch **fails Scala style tests**.
* This patch merges cleanly.
* This patch adds no public classes.

Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6796#issuecomment-119297928

Merged build finished. Test FAILed.
Github user rtreffer commented on the pull request: https://github.com/apache/spark/pull/6796#issuecomment-119301151

The writeDecimal method is rather ugly, and the write path needs to know whether we follow the Parquet style, as this implies a different encoding (addInteger / addLong).
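The encoding split rtreffer describes comes from the Parquet format spec: a DECIMAL with precision up to 9 fits a signed 32-bit integer (written via addInteger), up to 18 fits a signed 64-bit integer (addLong), and anything larger needs a fixed-length byte array. A minimal Python sketch of that dispatch follows; the helper names are hypothetical illustrations, not code from this PR.

```python
import math

def max_precision_for_bytes(num_bytes: int) -> int:
    """Largest base-10 precision that a signed integer of num_bytes can hold."""
    return int(math.floor(math.log10(2 ** (8 * num_bytes - 1) - 1)))

def physical_type_for_decimal(precision: int) -> str:
    """Pick a Parquet physical type for DECIMAL(precision, scale).

    The thresholds mirror the Parquet format spec: 9 digits fit an INT32,
    18 digits fit an INT64, larger precisions need a fixed-size buffer.
    """
    if precision <= 9:        # unscaled value written with addInteger
        return "INT32"
    elif precision <= 18:     # unscaled value written with addLong
        return "INT64"
    else:                     # big-endian two's-complement bytes, fixed width
        return "FIXED_LEN_BYTE_ARRAY"
```

This is why the write path must know the target style up front: the same logical decimal ends up in a different column writer call depending on its precision.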
[GitHub] spark pull request: [SPARK-4176][WIP] Support decimal types with p...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/6796#issuecomment-119299539 [Test build #36703 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/36703/consoleFull) for PR 6796 at commit [`7a57c16`](https://github.com/apache/spark/commit/7a57c163ec3fe516d3b173042329a9b6b135efa9). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6796#issuecomment-119296469 Merged build triggered.
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6796#issuecomment-119296488 Merged build started.
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6796#issuecomment-119329021 Merged build finished. Test FAILed.
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/6796#issuecomment-119328979 [Test build #36703 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/36703/console) for PR 6796 at commit [`7a57c16`](https://github.com/apache/spark/commit/7a57c163ec3fe516d3b173042329a9b6b135efa9).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
Github user rtreffer commented on the pull request: https://github.com/apache/spark/pull/6796#issuecomment-119170768 Hi @liancheng, I'm rebasing on your PR right now. I can work ~1-2h / day on this PR, so feel free to take it over if this blocks anything.
Github user liancheng commented on the pull request: https://github.com/apache/spark/pull/6796#issuecomment-118730215 Hey @rtreffer, just want to check whether you are still working on this? I'm asking because I just opened #7231 to refactor the Parquet read path for interoperability and backwards-compatibility, which also touches the decimal parts. I believe the new [`CatalystDecimalConverter`] [1] already covers the read path of decimals with precision > 18, which means this PR can be further simplified. Just in case you don't have time to continue this PR, I'm happy to fork your branch and get it merged (I will still list you as the main author). [1]: https://github.com/apache/spark/pull/7231/files#diff-1d6c363c04155a9328fe1f5bd08a2f90R237
Github user liancheng commented on a diff in the pull request: https://github.com/apache/spark/pull/6796#discussion_r33487350 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/parquet/CatalystSchemaConverter.scala ---
@@ -383,20 +386,14 @@ private[parquet] class CatalystSchemaConverter(
         .length(minBytesForPrecision(precision))
         .named(field.name)

-      case dec @ DecimalType() if !followParquetFormatSpec =>
-        throw new AnalysisException(
-          s"Data type $dec is not supported. " +
-            s"When ${SQLConf.PARQUET_FOLLOW_PARQUET_FORMAT_SPEC.key} is set to false, " +
-            "decimal precision and scale must be specified, " +
-            "and precision must be less than or equal to 18.")
--- End diff --
(Please see my comments [here] [1].) [1]: https://github.com/apache/spark/pull/6796/files#diff-83ef4d5f1029c8bebb49a0c139fa3154R301
Github user liancheng commented on a diff in the pull request: https://github.com/apache/spark/pull/6796#discussion_r33487385 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/parquet/CatalystSchemaConverter.scala ---
@@ -169,11 +169,12 @@ private[parquet] class CatalystSchemaConverter(
       }

       case INT96 =>
-        CatalystSchemaConverter.analysisRequire(
-          assumeInt96IsTimestamp,
-          "INT96 is not supported unless it's interpreted as timestamp. " +
-            s"Please try to set ${SQLConf.PARQUET_INT96_AS_TIMESTAMP.key} to true.")
-        TimestampType
+        field.getOriginalType match {
+          case DECIMAL => makeDecimalType(maxPrecisionForBytes(12))
+          case _ if assumeInt96IsTimestamp => TimestampType
+          case null => makeDecimalType(maxPrecisionForBytes(12))
+          case _ => illegalType()
+        }
--- End diff --
Yeah, it's not mentioned anywhere; I just got this information from the Parquet dev mailing list :)
Github user liancheng commented on a diff in the pull request: https://github.com/apache/spark/pull/6796#discussion_r33486747 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/parquet/ParquetTypes.scala ---
@@ -43,16 +43,27 @@ private[parquet] object ParquetTypesConverter extends Logging {
   }

   /**
-   * Compute the FIXED_LEN_BYTE_ARRAY length needed to represent a given DECIMAL precision.
+   * BYTES_FOR_PRECISION computes the required bytes to store a value of a certain decimal
+   * precision.
    */
-  private[parquet] val BYTES_FOR_PRECISION = Array.tabulate[Int](38) { precision =>
-    var length = 1
+  private[parquet] def BYTES_FOR_PRECISION_COMPUTE(precision : Int) : Int = {
+    var length = (precision / math.log10(2) - 1).toInt / 8
     while (math.pow(2.0, 8 * length - 1) < math.pow(10.0, precision)) {
       length += 1
     }
     length
   }
+  private[parquet] def BYTES_FOR_PRECISION_STATIC =
--- End diff --
Prefer `bytesForPrecisionStatic`.
Github user liancheng commented on a diff in the pull request: https://github.com/apache/spark/pull/6796#discussion_r33487239 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/parquet/ParquetTableSupport.scala ---
@@ -369,9 +371,6 @@ private[parquet] class MutableRowWriteSupport extends RowWriteSupport {
       case DateType => writer.addInteger(record.getInt(index))
       case TimestampType => writeTimestamp(record.getLong(index))
       case d: DecimalType =>
-        if (d.precisionInfo == None || d.precisionInfo.get.precision > 18) {
-          sys.error(s"Unsupported datatype $d, cannot write to consumer")
-        }
--- End diff --
Had an offline discussion with @yhuai but forgot to post a summary here: in the end we decided not to convert unlimited decimals to `decimal(10, 0)` implicitly in #6617, because firstly we need to confirm that all other parts work in a consistent way, which might introduce unexpected complexity in #6617, and secondly implicit conversion can often become a huge footgun. So let's still report an error in case of `d.precisionInfo == None` (but please throw an `AnalysisException` instead of using `sys.error`).
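The guard the review asks for can be sketched in a few lines. This is a hypothetical, language-neutral Python sketch (the class and function names are stand-ins, not Spark's API) of rejecting unlimited decimals explicitly instead of silently coercing them:

```python
class AnalysisException(Exception):
    """Stand-in for Spark's AnalysisException (illustrative only)."""

def resolve_precision(precision_info):
    """precision_info is a (precision, scale) tuple, or None for unlimited decimals.

    Unlimited decimals are rejected explicitly rather than being silently
    coerced to decimal(10, 0), matching the conclusion of the review discussion."""
    if precision_info is None:
        raise AnalysisException(
            "Unlimited decimal is not supported; please specify precision and scale")
    precision, _scale = precision_info
    return precision
```

The design point is that an explicit `AnalysisException` surfaces the schema problem at analysis time, whereas an implicit default would only show up later as silently truncated data.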
Github user liancheng commented on a diff in the pull request: https://github.com/apache/spark/pull/6796#discussion_r33486665 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/parquet/ParquetTypes.scala ---
@@ -43,16 +43,27 @@ private[parquet] object ParquetTypesConverter extends Logging {
   }

   /**
-   * Compute the FIXED_LEN_BYTE_ARRAY length needed to represent a given DECIMAL precision.
+   * BYTES_FOR_PRECISION computes the required bytes to store a value of a certain decimal
+   * precision.
    */
-  private[parquet] val BYTES_FOR_PRECISION = Array.tabulate[Int](38) { precision =>
-    var length = 1
+  private[parquet] def BYTES_FOR_PRECISION_COMPUTE(precision : Int) : Int = {
+    var length = (precision / math.log10(2) - 1).toInt / 8
     while (math.pow(2.0, 8 * length - 1) < math.pow(10.0, precision)) {
       length += 1
     }
     length
   }
+  private[parquet] def BYTES_FOR_PRECISION_STATIC =
+    (0 to 30).map(BYTES_FOR_PRECISION_COMPUTE).toArray
--- End diff --
30 should probably be replaced with 38, which fits in 16 bytes and is the maximum precision supported by Hive.
Github user liancheng commented on a diff in the pull request: https://github.com/apache/spark/pull/6796#discussion_r33486716 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/parquet/ParquetTypes.scala ---
@@ -43,16 +43,27 @@ private[parquet] object ParquetTypesConverter extends Logging {
   }

   /**
-   * Compute the FIXED_LEN_BYTE_ARRAY length needed to represent a given DECIMAL precision.
+   * BYTES_FOR_PRECISION computes the required bytes to store a value of a certain decimal
+   * precision.
    */
-  private[parquet] val BYTES_FOR_PRECISION = Array.tabulate[Int](38) { precision =>
-    var length = 1
+  private[parquet] def BYTES_FOR_PRECISION_COMPUTE(precision : Int) : Int = {
--- End diff --
Prefer `bytesForPrecision` since it's a method instead of a constant.
Github user liancheng commented on the pull request: https://github.com/apache/spark/pull/6796#issuecomment-116766442 Hi @rtreffer,

- How is the compatibility mode intended to work? Settings are currently private, but I'd like to store Decimal(19), so is lifting the 18 limit correct for compatibility mode?

The compatibility mode is enabled by setting `spark.sql.parquet.followParquetFormatSpec` to `false`. This mode must be enabled for now, because the write path hasn't been refactored to follow the Parquet format spec. Note that compatibility mode only affects the write path, because the Parquet format spec also covers legacy formats via various backwards-compatibility rules. Decimals with precision > 18 could be enabled even in compatibility mode, because it doesn't affect compatibility: old Spark versions couldn't read decimals with precision > 18 in the first place. What do you mean by saying settings are currently private? `SQLConf.PARQUET_FOLLOW_PARQUET_FORMAT_SPEC` is `private[spark]`, so all classes under `org.apache.spark` can access it.

- INT32/INT64 are only used when the byte length matches the byte length for the precision. FIXED_LEN_BYTE_ARRAY will thus e.g. be used to store 6 byte values

I see your point. You mentioned a debate in [this comment] [1]; were you referring to [this one] [2]? From the perspective of storage efficiency, it probably makes sense. (I said probably because I'm not quite sure about the average case after taking encoding/compression into consideration.) However, in the case of Parquet, we usually care more about speed and memory consumption. In particular, Parquet can be very memory consuming when reading files with a wide schema (i.e., a large column count). A key advantage of `INT32` and `INT64` is that they avoid boxing costs in many cases and thus can be faster and use less memory. Also, you don't need to do all those bit operations to encode/decode the unscaled long value of a decimal when using `INT32` and `INT64`.
In the meantime, Parquet handles `INT32` and `INT64` pretty efficiently. There are more encoders for integral types than for binaries (either fixed-length or not; see [Encodings.md] [3] for more details). Although I haven't benchmarked this, I believe that in many cases the storage efficiency of `INT32` can be comparable to or even better than `FIXED_LEN_BYTE_ARRAY` with a length less than 4. The same should also apply to `INT64`. So I suggest: when compatibility mode is off, we just use `INT32` for 1 <= precision <= 9 and `INT64` for 10 <= precision <= 18 when converting `DecimalType`s in `CatalystSchemaConverter`. When we refactor the write path to follow the Parquet format spec, we can write decimals as `INT32` and `INT64` where appropriate in follow-up PRs. The TL;DR is: I'd just remove `precision <= maxPrecisionForBytes(8)` in [this line] [4] and leave everything else unmodified (your comment updates look good to me though :)

- FIXED_LEN_BYTE_ARRAY means I'll have to create an array of the correct size. I've increased the scratch_bytes. Not very happy about the code path, do you have better ideas?

Hive limits the max precision of a decimal to 38, which fits in 16 bytes, so 16 rather than 4096 bytes should be enough for most cases. Also, it would be better to refactor the branches of [this `if` expression] [5] into two separate methods for clarity. Otherwise it looks good.

- BYTES_FOR_PRECISION needs to handle any precision. I've reworked that code. Again, suggestions welcome

(See my other comments inline.)
[1]: https://github.com/apache/spark/pull/6796#discussion_r33420742 [2]: https://github.com/apache/spark/pull/6796#discussion_r32891515 [3]: https://github.com/Parquet/parquet-format/blob/master/Encodings.md [4]: https://github.com/apache/spark/pull/6796/files#diff-a4c01298c63223d113645a31c01141baL377 [5]: https://github.com/apache/spark/pull/6796/files#diff-83ef4d5f1029c8bebb49a0c139fa3154R301
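The suggested type mapping can be summarized in a few lines. A hedged Python sketch (function name hypothetical) of the physical-type selection proposed in the review: primitive ints where the unscaled value fits, `FIXED_LEN_BYTE_ARRAY` otherwise, and always `FIXED_LEN_BYTE_ARRAY` on the legacy/compatibility path:

```python
def parquet_type_for_decimal(precision: int, follow_parquet_spec: bool) -> str:
    """Illustrative mapping from decimal precision to a Parquet physical type."""
    if not follow_parquet_spec:
        # Legacy Spark <= 1.4.x layout: always fixed-length byte arrays.
        return "FIXED_LEN_BYTE_ARRAY"
    if precision <= 9:    # unscaled value fits in 4 bytes
        return "INT32"
    if precision <= 18:   # unscaled value fits in 8 bytes
        return "INT64"
    return "FIXED_LEN_BYTE_ARRAY"
```

The design rationale from the comment above: `INT32`/`INT64` avoid boxing and per-value bit manipulation on the read path, and Parquet's richer set of integer encodings often compensates for the extra raw bytes compared to a shorter fixed-length array.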
Github user rtreffer commented on the pull request: https://github.com/apache/spark/pull/6796#issuecomment-116769197 Hi @liancheng, thank you for the thorough review; I will push a reworked version soon. Everything sounds reasonable :-) By private settings I meant that I can't change the setting in the shell because it's marked as isPublic = false in https://github.com/liancheng/spark/blob/2a2062d3f530ecd26e75b306aee42761d67d8724/sql/core/src/main/scala/org/apache/spark/sql/SQLConf.scala#L273 I'm not sure if that's intended.
Github user liancheng commented on a diff in the pull request: https://github.com/apache/spark/pull/6796#discussion_r33487973 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/parquet/CatalystSchemaConverter.scala ---
@@ -383,20 +386,14 @@ private[parquet] class CatalystSchemaConverter(
         .length(minBytesForPrecision(precision))
         .named(field.name)

-      case dec @ DecimalType() if !followParquetFormatSpec =>
-        throw new AnalysisException(
-          s"Data type $dec is not supported. " +
-            s"When ${SQLConf.PARQUET_FOLLOW_PARQUET_FORMAT_SPEC.key} is set to false, " +
-            "decimal precision and scale must be specified, " +
-            "and precision must be less than or equal to 18.")

       // =====================================
       // Decimals (follow Parquet format spec)
       // =====================================

-      // Uses INT32 for 1 <= precision <= 9
+      // Uses INT32 for 4 byte encodings / precision <= 9
       case DecimalType.Fixed(precision, scale)
-        if precision <= maxPrecisionForBytes(4) && followParquetFormatSpec =>
+        if followParquetFormatSpec && maxPrecisionForBytes(3) < precision &&
+          precision <= maxPrecisionForBytes(4) =>
--- End diff --
(Please see my comment [here] [1].) [1]: https://github.com/apache/spark/pull/6796#issuecomment-116766442
Github user liancheng commented on the pull request: https://github.com/apache/spark/pull/6796#issuecomment-116879298 @rtreffer Yeah, it's intended. As explained above, this feature flag must be set to `false` for now because the write path hasn't been refactored to respect the Parquet format spec. If we turned it on, `CatalystSchemaConverter` would generate a standard Parquet schema while the write path would still write data conforming to the old legacy format, which would lead to data corruption.
Github user liancheng commented on a diff in the pull request: https://github.com/apache/spark/pull/6796#discussion_r33420570 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/parquet/CatalystSchemaConverter.scala ---
@@ -383,20 +386,14 @@ private[parquet] class CatalystSchemaConverter(
         .length(minBytesForPrecision(precision))
         .named(field.name)

-      case dec @ DecimalType() if !followParquetFormatSpec =>
-        throw new AnalysisException(
-          s"Data type $dec is not supported. " +
-            s"When ${SQLConf.PARQUET_FOLLOW_PARQUET_FORMAT_SPEC.key} is set to false, " +
-            "decimal precision and scale must be specified, " +
-            "and precision must be less than or equal to 18.")
--- End diff --
We still need this branch to handle the case where precision information is missing.
Github user liancheng commented on a diff in the pull request: https://github.com/apache/spark/pull/6796#discussion_r33420649 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/parquet/CatalystSchemaConverter.scala ---
@@ -404,9 +401,10 @@ private[parquet] class CatalystSchemaConverter(
         .scale(scale)
         .named(field.name)

-      // Uses INT64 for 1 <= precision <= 18
+      // Uses INT64 for 8 byte encodings / precision <= 18
       case DecimalType.Fixed(precision, scale)
-        if precision <= maxPrecisionForBytes(8) && followParquetFormatSpec =>
+        if followParquetFormatSpec && maxPrecisionForBytes(7) < precision &&
+          precision <= maxPrecisionForBytes(8) =>
--- End diff --
Same question as above...
Github user liancheng commented on a diff in the pull request: https://github.com/apache/spark/pull/6796#discussion_r33420652 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/parquet/CatalystSchemaConverter.scala ---
@@ -562,4 +560,5 @@ private[parquet] object CatalystSchemaConverter {
     throw new AnalysisException(message)
   }
 }
+
--- End diff --
Nit: Remove this newline.
Github user liancheng commented on a diff in the pull request: https://github.com/apache/spark/pull/6796#discussion_r33420661 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/parquet/ParquetTableSupport.scala ---
@@ -212,10 +212,7 @@ private[parquet] class RowWriteSupport extends WriteSupport[InternalRow] with Lo
       case BooleanType => writer.addBoolean(value.asInstanceOf[Boolean])
       case DateType => writer.addInteger(value.asInstanceOf[Int])
       case d: DecimalType =>
-        if (d.precisionInfo == None || d.precisionInfo.get.precision > 18) {
-          sys.error(s"Unsupported datatype $d, cannot write to consumer")
-        }
-        writeDecimal(value.asInstanceOf[Decimal], d.precisionInfo.get.precision)
+        writeDecimal(value.asInstanceOf[Decimal], d.precisionInfo.map(_.precision).getOrElse(10))
--- End diff --
Need to report an error for `DecimalType(None)`.
Github user rtreffer commented on a diff in the pull request: https://github.com/apache/spark/pull/6796#discussion_r33420742 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/parquet/CatalystSchemaConverter.scala ---
@@ -383,20 +386,14 @@ private[parquet] class CatalystSchemaConverter(
-      // Uses INT32 for 1 <= precision <= 9
+      // Uses INT32 for 4 byte encodings / precision <= 9
       case DecimalType.Fixed(precision, scale)
-        if precision <= maxPrecisionForBytes(4) && followParquetFormatSpec =>
+        if followParquetFormatSpec && maxPrecisionForBytes(3) < precision &&
+          precision <= maxPrecisionForBytes(4) =>
--- End diff --
We had a debate about using the most compact storage type where possible. In that respect, INT32 loses compared to a 3 byte fixed length array.
On 28 June 2015 10:59:15 CEST, Cheng Lian notificati...@github.com wrote:
> case DecimalType.Fixed(precision, scale)
> -        if precision <= maxPrecisionForBytes(4) && followParquetFormatSpec =>
> +        if followParquetFormatSpec && maxPrecisionForBytes(3) < precision &&
> +          precision <= maxPrecisionForBytes(4) =>
> Why do we want `maxPrecisionForBytes(3) < precision` here? Did I miss something?
--- Reply to this email directly or view it on GitHub: https://github.com/apache/spark/pull/6796/files#r33420647 -- Sent from my Android phone with K-9 Mail.
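To see what the debated guard actually selects, a small Python sketch of `maxPrecisionForBytes` may help (a hedged reimplementation of the helper referenced in the diff, not Spark's source): the largest decimal precision whose unscaled values all fit in a given number of signed bytes.

```python
import math

def max_precision_for_bytes(num_bytes: int) -> int:
    """Largest precision p such that every p-digit unscaled value fits in
    a signed two's-complement integer of `num_bytes` bytes."""
    return int(math.floor(math.log10(2 ** (8 * num_bytes - 1))))
```

Since `max_precision_for_bytes(3)` is 6 and `max_precision_for_bytes(4)` is 9, the guard `maxPrecisionForBytes(3) < precision <= maxPrecisionForBytes(4)` routes only precisions 7-9 to `INT32`, leaving precisions 1-6 for a 3-byte-or-smaller fixed-length array under the compact-storage scheme rtreffer describes.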
Github user liancheng commented on a diff in the pull request: https://github.com/apache/spark/pull/6796#discussion_r33420609 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/parquet/CatalystSchemaConverter.scala ---
@@ -169,11 +169,12 @@ private[parquet] class CatalystSchemaConverter(
       }

       case INT96 =>
-        CatalystSchemaConverter.analysisRequire(
-          assumeInt96IsTimestamp,
-          "INT96 is not supported unless it's interpreted as timestamp. " +
-            s"Please try to set ${SQLConf.PARQUET_INT96_AS_TIMESTAMP.key} to true.")
-        TimestampType
+        field.getOriginalType match {
+          case DECIMAL => makeDecimalType(maxPrecisionForBytes(12))
+          case _ if assumeInt96IsTimestamp => TimestampType
+          case null => makeDecimalType(maxPrecisionForBytes(12))
+          case _ => illegalType()
+        }
--- End diff --
`INT96` is only used for nanosecond timestamp types for historical reasons, and is to be deprecated. Let's not use it for decimals.
Github user liancheng commented on a diff in the pull request: https://github.com/apache/spark/pull/6796#discussion_r33420628 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/parquet/CatalystSchemaConverter.scala ---
@@ -373,8 +374,10 @@ private[parquet] class CatalystSchemaConverter(
       // Spark 1.4.x and prior versions only support decimals with a maximum precision of 18 and
       // always store decimals in fixed-length byte arrays.
+      // Always storing FIXED_LEN_BYTE_ARRAY is thus compatible with Spark <= 1.4.x, except for
+      // precisions > 18.
       case DecimalType.Fixed(precision, scale)
-        if precision <= maxPrecisionForBytes(8) && !followParquetFormatSpec =>
+        if !followParquetFormatSpec =>
--- End diff --
Nit: Let's join this line and the line above.
Github user liancheng commented on a diff in the pull request: https://github.com/apache/spark/pull/6796#discussion_r33420647 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/parquet/CatalystSchemaConverter.scala ---
@@ -383,20 +386,14 @@ private[parquet] class CatalystSchemaConverter(
-      // Uses INT32 for 1 <= precision <= 9
+      // Uses INT32 for 4 byte encodings / precision <= 9
       case DecimalType.Fixed(precision, scale)
-        if precision <= maxPrecisionForBytes(4) && followParquetFormatSpec =>
+        if followParquetFormatSpec && maxPrecisionForBytes(3) < precision &&
+          precision <= maxPrecisionForBytes(4) =>
--- End diff --
Why do we want `maxPrecisionForBytes(3) < precision` here? Did I miss something?
[GitHub] spark pull request: [SPARK-4176][WIP] Support decimal types with p...
Github user rtreffer commented on a diff in the pull request: https://github.com/apache/spark/pull/6796#discussion_r33420714 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/parquet/CatalystSchemaConverter.scala --- @@ -383,20 +386,14 @@ private[parquet] class CatalystSchemaConverter( .length(minBytesForPrecision(precision)) .named(field.name) - case dec @ DecimalType() if !followParquetFormatSpec => -throw new AnalysisException( - s"Data type $dec is not supported. " + -s"When ${SQLConf.PARQUET_FOLLOW_PARQUET_FORMAT_SPEC.key} is set to false, " + -s"decimal precision and scale must be specified, " + -s"and precision must be less than or equal to 18.") - --- End diff -- You said it should use the hive default of (10,0) - or did I misinterpret that? On 28 June 2015 10:53:00 CEST, Cheng Lian notificati...@github.com wrote: We still need this branch to handle the case where precision information is missing. --- Reply to this email directly or view it on GitHub: https://github.com/apache/spark/pull/6796/files#r33420570
[GitHub] spark pull request: [SPARK-4176][WIP] Support decimal types with p...
Github user liancheng commented on the pull request: https://github.com/apache/spark/pull/6796#issuecomment-116231004 @rtreffer Thanks for rebasing and simplifying this! I left some comments but haven't finished my review, will be back after confirming some details related to your questions.
[GitHub] spark pull request: [SPARK-4176][WIP] Support decimal types with p...
Github user rtreffer commented on a diff in the pull request: https://github.com/apache/spark/pull/6796#discussion_r33420719 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/parquet/CatalystSchemaConverter.scala --- @@ -169,11 +169,12 @@ private[parquet] class CatalystSchemaConverter( } case INT96 => -CatalystSchemaConverter.analysisRequire( - assumeInt96IsTimestamp, - "INT96 is not supported unless it's interpreted as timestamp. " + -s"Please try to set ${SQLConf.PARQUET_INT96_AS_TIMESTAMP.key} to true.") -TimestampType +field.getOriginalType match { + case DECIMAL => makeDecimalType(maxPrecisionForBytes(12)) + case _ if assumeInt96IsTimestamp => TimestampType + case null => makeDecimalType(maxPrecisionForBytes(12)) + case _ => illegalType() +} --- End diff -- Didn't know about the deprecation, will drop it. On 28 June 2015 10:56:00 CEST, Cheng Lian notificati...@github.com wrote: `INT96` is only used for nanosecond timestamp types for historical reasons, and is to be deprecated. Let's not use it for decimals. --- Reply to this email directly or view it on GitHub: https://github.com/apache/spark/pull/6796/files#r33420609
[GitHub] spark pull request: [SPARK-4176][WIP] Support decimal types with p...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6796#issuecomment-115619981 Merged build started.
[GitHub] spark pull request: [SPARK-4176][WIP] Support decimal types with p...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6796#issuecomment-115619879 Merged build triggered.
[GitHub] spark pull request: [SPARK-4176][WIP] Support decimal types with p...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/6796#issuecomment-115620346 [Test build #35855 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/35855/consoleFull) for PR 6796 at commit [`5fe321e`](https://github.com/apache/spark/commit/5fe321ee027570eea49869bcbe80c55246538229).
[GitHub] spark pull request: [SPARK-4176][WIP] Support decimal types with p...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/6796#issuecomment-115621136 [Test build #35855 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/35855/console) for PR 6796 at commit [`5fe321e`](https://github.com/apache/spark/commit/5fe321ee027570eea49869bcbe80c55246538229). * This patch **fails Scala style tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-4176][WIP] Support decimal types with p...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6796#issuecomment-115621149 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-4176][WIP] Support decimal types with p...
Github user rtreffer commented on the pull request: https://github.com/apache/spark/pull/6796#issuecomment-115620818 @liancheng it starts to work (compiles and minimal initial test worked, no guarantees). I think there are some points that need feedback - How is the compatibility mode intended to work? Settings are currently private, but I'd like to store Decimal(19), so is lifting the 18 limit correct for compatibility mode? - INT32/INT64 are only used when the byte length matches the byte length for the precision. FIXED_LEN_BYTE_ARRAY will thus e.g. be used to store 6 byte values - FIXED_LEN_BYTE_ARRAY means I'll have to create an array of the correct size. I've increased the scratch_bytes. Not very happy about the code path, do you have better ideas? - BYTES_FOR_PRECISION needs to handle any precision. I've reworked that code. Again, suggestions welcome The patch is now way smaller and less intrusive. Looks like the refactoring was well worth the effort!
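The FIXED_LEN_BYTE_ARRAY point above boils down to one step: the writer must emit exactly the declared number of bytes per value, sign-extending the unscaled value's minimal two's-complement encoding into a pre-sized scratch buffer. A hedged sketch of that padding step (the helper name and shape are illustrative, not Spark's actual writer code):

```scala
import java.math.BigInteger

object FixedLenPad {
  // Sign-extend the minimal two's-complement encoding of `unscaled`
  // to exactly `numBytes` bytes, as FIXED_LEN_BYTE_ARRAY requires.
  def toFixedLenBytes(unscaled: BigInteger, numBytes: Int): Array[Byte] = {
    val raw = unscaled.toByteArray // minimal two's-complement form
    require(raw.length <= numBytes,
      s"value needs ${raw.length} bytes but the field is $numBytes bytes wide")
    val out = new Array[Byte](numBytes)
    // leading pad bytes are 0xFF for negative values, 0x00 otherwise
    val pad: Byte = if (unscaled.signum < 0) 0xFF.toByte else 0x00
    java.util.Arrays.fill(out, 0, numBytes - raw.length, pad)
    System.arraycopy(raw, 0, out, numBytes - raw.length, raw.length)
    out
  }
}
```

For example, padding `BigInteger.valueOf(1)` to 4 bytes yields `00 00 00 01`, and `-1` yields `FF FF FF FF`, so the fixed-width bytes still decode to the same signed value.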
[GitHub] spark pull request: [SPARK-4176][WIP] Support decimal types with p...
Github user sujkh85 commented on the pull request: https://github.com/apache/spark/pull/6796#issuecomment-115389042 NAVER - http://www.naver.com/ su...@naver.com [Automated bounce notice, translated from Korean:] The mail you sent, "Re: [spark] [SPARK-4176][WIP] Support decimal types with precision > 18 in parquet (#6796)", failed to be delivered for the following reason: the recipient has blocked receiving your mail.
[GitHub] spark pull request: [SPARK-4176][WIP] Support decimal types with p...
Github user rtreffer commented on the pull request: https://github.com/apache/spark/pull/6796#issuecomment-11533 Currently reworking the patch. Here is the warning about the tuple match ``` [warn] /home/rtreffer/work/spark-master/sql/core/src/main/scala/org/apache/spark/sql/jdbc/JDBCRDD.scala:334: object Fixed expects 2 patterns to hold (Int, Int) but crushing into 2-tuple to fit single pattern (SI-6675) ``` According to the ticket it's a deprecation warning. https://issues.scala-lang.org/browse/SI-6675 Nothing urgent, but I think it should be fixed at some point.
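The SI-6675 warning quoted above concerns extractors whose `unapply` yields a tuple: a single pattern silently binds the whole tuple instead of its components. A toy reproduction (the `Fixed` extractor here is a stand-in with the same shape as `DecimalType.Fixed`, not Spark's class):

```scala
// Stand-in extractor: unapply yields a (precision, scale) pair,
// mirroring the shape of DecimalType.Fixed.
object Fixed {
  def unapply(t: (Int, Int)): Option[(Int, Int)] = Some(t)
}

object Demo {
  def describe(info: (Int, Int)): String = info match {
    // Deprecated single-pattern form (what scalac warns about):
    //   case Fixed(d) => ...  // d silently becomes the whole (Int, Int) tuple
    // Explicit two-pattern form, which the JDBCRDD fix switches to:
    case Fixed(precision, scale) => s"precision=$precision scale=$scale"
  }
}
```

Matching `(19, 4)` with `Fixed(precision, scale)` binds the two components separately, which is why rewriting `DecimalType.Fixed(d)` as `DecimalType.Fixed(d, s)` silences the warning.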
[GitHub] spark pull request: [SPARK-4176][WIP] Support decimal types with p...
Github user rtreffer commented on a diff in the pull request: https://github.com/apache/spark/pull/6796#discussion_r33124241 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/parquet/ParquetTableSupport.scala --- @@ -369,9 +371,6 @@ private[parquet] class MutableRowWriteSupport extends RowWriteSupport { case DateType => writer.addInteger(record.getInt(index)) case TimestampType => writeTimestamp(record.getLong(index)) case d: DecimalType => -if (d.precisionInfo == None || d.precisionInfo.get.precision > 18) { - sys.error(s"Unsupported datatype $d, cannot write to consumer") -} --- End diff -- Overlooked, a bug. We can't serialize without that info, parquet requires it (and mixed scale would be complicated). PS: do you know if there is any interest in allowing mixed Decimal in parquet? On 24 June 2015 09:36:11 CEST, Cheng Lian notificati...@github.com wrote: Don't we need to consider the case where `d.precisionInfo == None` now? --- Reply to this email directly or view it on GitHub: https://github.com/apache/spark/pull/6796/files#r33123740
[GitHub] spark pull request: [SPARK-4176][WIP] Support decimal types with p...
Github user liancheng commented on a diff in the pull request: https://github.com/apache/spark/pull/6796#discussion_r33123740 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/parquet/ParquetTableSupport.scala --- @@ -369,9 +371,6 @@ private[parquet] class MutableRowWriteSupport extends RowWriteSupport { case DateType => writer.addInteger(record.getInt(index)) case TimestampType => writeTimestamp(record.getLong(index)) case d: DecimalType => -if (d.precisionInfo == None || d.precisionInfo.get.precision > 18) { - sys.error(s"Unsupported datatype $d, cannot write to consumer") -} --- End diff -- Don't we need to consider the case where `d.precisionInfo == None` now?
[GitHub] spark pull request: [SPARK-4176][WIP] Support decimal types with p...
Github user rtreffer commented on a diff in the pull request: https://github.com/apache/spark/pull/6796#discussion_r33129732 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/jdbc/JDBCRDD.scala --- @@ -331,7 +331,7 @@ private[sql] class JDBCRDD( case BooleanType => BooleanConversion case DateType => DateConversion case DecimalType.Unlimited => DecimalConversion(None) - case DecimalType.Fixed(d) => DecimalConversion(Some(d)) + case DecimalType.Fixed(d, s) => DecimalConversion(Some((d, s))) --- End diff -- As said it was only about a warning, not about correctness. I'll drop this change on the next version, it draws too much attention and is not needed.
[GitHub] spark pull request: [SPARK-4176][WIP] Support decimal types with p...
Github user liancheng commented on a diff in the pull request: https://github.com/apache/spark/pull/6796#discussion_r33123578 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/parquet/ParquetTypes.scala --- @@ -289,7 +295,13 @@ private[parquet] object ParquetTypesConverter extends Logging { name: String, nullable: Boolean = true, inArray: Boolean = false, + parquetSchema: Option[ParquetType] = None, toThriftSchemaNames: Boolean = false): ParquetType = { +val parquetElementTypeBySchema = parquetSchema.collect { +case gType : ParquetGroupType if (gType.containsField(name)) => gType.getType(name) --- End diff -- Nit: Remove the space before `:`
[GitHub] spark pull request: [SPARK-4176][WIP] Support decimal types with p...
Github user liancheng commented on a diff in the pull request: https://github.com/apache/spark/pull/6796#discussion_r33128740 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/parquet/ParquetIOSuite.scala --- @@ -108,7 +108,7 @@ class ParquetIOSuiteBase extends QueryTest with ParquetTest { // Parquet doesn't allow column names with spaces, have to add an alias here .select($"_1" cast decimal as "dec") -for ((precision, scale) <- Seq((5, 2), (1, 0), (1, 1), (18, 10), (18, 17))) { +for ((precision, scale) <- Seq((5, 2), (1, 0), (1, 1), (18, 10), (18, 17), (60, 5))) { --- End diff -- It would be good to add one more edge case here, namely `(19, n)`.
[GitHub] spark pull request: [SPARK-4176][WIP] Support decimal types with p...
Github user liancheng commented on a diff in the pull request: https://github.com/apache/spark/pull/6796#discussion_r33119772 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/parquet/ParquetTypes.scala --- @@ -229,11 +231,15 @@ private[parquet] object ParquetTypesConverter extends Logging { case LongType => Some(ParquetTypeInfo(ParquetPrimitiveTypeName.INT64)) case TimestampType => Some(ParquetTypeInfo(ParquetPrimitiveTypeName.INT96)) case DecimalType.Fixed(precision, scale) if precision <= 18 => - // TODO: for now, our writer only supports decimals that fit in a Long Some(ParquetTypeInfo(ParquetPrimitiveTypeName.FIXED_LEN_BYTE_ARRAY, --- End diff -- Using int32 and int64 makes encoding and decoding faster since they don't introduce boxing costs. But I agree that should be made in another PR.
[GitHub] spark pull request: [SPARK-4176][WIP] Support decimal types with p...
Github user rtreffer commented on a diff in the pull request: https://github.com/apache/spark/pull/6796#discussion_r33133591 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/parquet/ParquetTypes.scala --- @@ -229,11 +231,15 @@ private[parquet] object ParquetTypesConverter extends Logging { case LongType => Some(ParquetTypeInfo(ParquetPrimitiveTypeName.INT64)) case TimestampType => Some(ParquetTypeInfo(ParquetPrimitiveTypeName.INT96)) case DecimalType.Fixed(precision, scale) if precision <= 18 => - // TODO: for now, our writer only supports decimals that fit in a Long Some(ParquetTypeInfo(ParquetPrimitiveTypeName.FIXED_LEN_BYTE_ARRAY, Some(ParquetOriginalType.DECIMAL), Some(new DecimalMetadata(precision, scale)), Some(BYTES_FOR_PRECISION(precision)))) +case DecimalType.Fixed(precision, scale) => + Some(ParquetTypeInfo(ParquetPrimitiveTypeName.BINARY, --- End diff -- Under the assumption that all values will use the full length, yes. But at some point the overhead of the length is low compared to the overhead if someone specifies just the upper bound of values. I have to check if it really uses 4 bytes for BINARY. I'd then raise the threshold to ~40 bytes length (meaning <= 10% worst case overhead before compression). It won't simplify the decoding/writing though, because the <= 18 case is used for long decoding.
[GitHub] spark pull request: [SPARK-4176][WIP] Support decimal types with p...
Github user liancheng commented on a diff in the pull request: https://github.com/apache/spark/pull/6796#discussion_r33132365 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/parquet/ParquetTypes.scala --- @@ -229,11 +231,15 @@ private[parquet] object ParquetTypesConverter extends Logging { case LongType => Some(ParquetTypeInfo(ParquetPrimitiveTypeName.INT64)) case TimestampType => Some(ParquetTypeInfo(ParquetPrimitiveTypeName.INT96)) case DecimalType.Fixed(precision, scale) if precision <= 18 => - // TODO: for now, our writer only supports decimals that fit in a Long Some(ParquetTypeInfo(ParquetPrimitiveTypeName.FIXED_LEN_BYTE_ARRAY, Some(ParquetOriginalType.DECIMAL), Some(new DecimalMetadata(precision, scale)), Some(BYTES_FOR_PRECISION(precision)))) +case DecimalType.Fixed(precision, scale) => + Some(ParquetTypeInfo(ParquetPrimitiveTypeName.BINARY, --- End diff -- Using `BINARY` here conforms to the Parquet format spec. But according to the spec, `FIXED_LEN_BYTE_ARRAY` with different lengths can also be used to store decimals with different precisions. From the perspective of storage efficiency, `FIXED_LEN_BYTE_ARRAY` is probably preferable, since `BINARY` has variable length and needs 4 extra bytes to encode the length (before being encoded and compressed). Another benefit here is that we can just unify the cases for precision <= 18 and precision > 18.
[GitHub] spark pull request: [SPARK-4176][WIP] Support decimal types with p...
Github user rtreffer commented on the pull request: https://github.com/apache/spark/pull/6796#issuecomment-114835713 @liancheng I'll rebase on your branch, I really like the way you cleaned up toPrimitiveDataType by using a fluent Types interface. This will make this patch way easier. Talking about testing/compatibility/interoperability, I have added a hive-generated parquet file that I'd like to turn into a test case: https://github.com/rtreffer/spark/tree/spark-4176-store-large-decimal-in-parquet/sql/core/src/test/resources/hive-decimal-parquet There are some parquet files attached to tickets in jira, too. Do you plan to convert those into tests? Regarding FIXED_LEN_BYTE_ARRAY: the overhead decreases relative to the value size. BINARY overhead would be 10% from ~DECIMAL(100) and 25% from ~DECIMAL(40) (pre-compression). I'd expect DECIMAL(40) to use the full precision only from time to time. But yeah, I overlooked the 4 byte overhead at https://github.com/Parquet/parquet-format/blob/master/Encodings.md and assumed it would be less; FIXED_LEN_BYTE_ARRAY should be good for now (until someone complains).
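The overhead percentages discussed above can be sanity-checked: they are just BINARY's 4-byte length prefix divided by the fixed byte width those precisions imply. A quick back-of-envelope sketch (the byte widths are derived from the usual floor-log10 precision formula, as used by `minBytesForPrecision` in the diffs; the percentages are pre-compression approximations):

```scala
object BinaryOverhead {
  // Smallest two's-complement byte width that can hold `precision` digits.
  def minBytesForPrecision(precision: Int): Int =
    (1 to 64).find { n =>
      math.floor(math.log10(math.pow(2, 8 * n - 1) - 1)) >= precision
    }.get

  // Relative cost of BINARY's 4-byte length prefix vs. a fixed-length payload.
  def lengthPrefixOverhead(precision: Int): Double =
    4.0 / minBytesForPrecision(precision)

  def main(args: Array[String]): Unit = {
    // DECIMAL(40) needs 17 payload bytes -> 4/17, roughly 24% overhead
    println(f"DECIMAL(40): ${lengthPrefixOverhead(40) * 100}%.1f%%")
    // DECIMAL(100) needs 42 payload bytes -> 4/42, roughly 10% overhead
    println(f"DECIMAL(100): ${lengthPrefixOverhead(100) * 100}%.1f%%")
  }
}
```

This matches the ballpark figures in the comment: the length prefix matters most for mid-sized decimals and fades below 10% only around DECIMAL(100).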
[GitHub] spark pull request: [SPARK-4176][WIP] Support decimal types with p...
Github user liancheng commented on the pull request: https://github.com/apache/spark/pull/6796#issuecomment-114810864 @rtreffer I'm working on improving compatibility and interoperability of Spark SQL's Parquet support. The first part is #6617, where I refactored the schema conversion code so that now we can stick to the most recent Parquet format spec (`ParquetTypes.scala` replaced with `CatalystSchemaConverter.scala`). The schema conversion part of the decimal precision problem is also handled there ([1] [1], [2] [2]). Would you mind if I merge that one and then you rebase this PR? I think it would be much easier to work with. Basically you only need to: 1. Remove the `precision = ...` part in [this line] [3], and 1. Always use `FIXED_LEN_BYTE_ARRAY` to store decimals [1]: https://github.com/apache/spark/pull/6617/files#diff-a4c01298c63223d113645a31c01141baR370 [2]: https://github.com/apache/spark/pull/6617/files#diff-a4c01298c63223d113645a31c01141baR118 [3]: https://github.com/apache/spark/pull/6617/files#diff-a4c01298c63223d113645a31c01141baR377
[GitHub] spark pull request: [SPARK-4176][WIP] Support decimal types with p...
Github user liancheng commented on a diff in the pull request: https://github.com/apache/spark/pull/6796#discussion_r33133479 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/parquet/ParquetTableSupport.scala --- @@ -369,9 +371,6 @@ private[parquet] class MutableRowWriteSupport extends RowWriteSupport { case DateType => writer.addInteger(record.getInt(index)) case TimestampType => writeTimestamp(record.getLong(index)) case d: DecimalType => -if (d.precisionInfo == None || d.precisionInfo.get.precision > 18) { - sys.error(s"Unsupported datatype $d, cannot write to consumer") -} --- End diff -- I don't think Parquet allows mixed decimal precision. However, Hive uses a default precision = 10 and a default scale = 0 when precision/scale information is missing. I also did the same thing in #6617.
[GitHub] spark pull request: [SPARK-4176][WIP] Support decimal types with p...
Github user liancheng commented on the pull request: https://github.com/apache/spark/pull/6796#issuecomment-115025231 @rtreffer I've merged #6617.
[GitHub] spark pull request: [SPARK-4176][WIP] Support decimal types with p...
Github user liancheng commented on the pull request: https://github.com/apache/spark/pull/6796#issuecomment-114978147 @rtreffer Cool thanks! Then I'm merging #6617 shortly. And yes, I plan to convert those Parquet files into tests, probably just include them as test resources.
[GitHub] spark pull request: [SPARK-4176][WIP] Support decimal types with p...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/6796#issuecomment-114532812 [Test build #35545 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/35545/console) for PR 6796 at commit [`8ff6603`](https://github.com/apache/spark/commit/8ff660369601418c36ed82e9549bd676a83b8345). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-4176][WIP] Support decimal types with p...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6796#issuecomment-114489076 Merged build started.
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6796#issuecomment-114532858

Merged build finished. Test PASSed.
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6796#issuecomment-114479672

Merged build triggered.
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/6796#issuecomment-114489864

[Test build #35545 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/35545/consoleFull) for PR 6796 at commit [`8ff6603`](https://github.com/apache/spark/commit/8ff660369601418c36ed82e9549bd676a83b8345).
Github user liancheng commented on a diff in the pull request: https://github.com/apache/spark/pull/6796#discussion_r33102495

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/jdbc/JDBCRDD.scala ---
```
@@ -331,7 +331,7 @@ private[sql] class JDBCRDD(
       case BooleanType => BooleanConversion
       case DateType => DateConversion
       case DecimalType.Unlimited => DecimalConversion(None)
-      case DecimalType.Fixed(d) => DecimalConversion(Some(d))
+      case DecimalType.Fixed(d, s) => DecimalConversion(Some((d, s)))
```
--- End diff --

This should be correct. Scala pattern extractors return multiple values as a tuple.
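The point about extractors can be shown with a minimal standalone sketch; `DecimalInfo` and `Fixed` here are hypothetical stand-ins, not Spark's actual `DecimalType` classes:

```scala
// Hypothetical stand-ins for illustration only: a custom extractor whose
// unapply returns two values packed in a tuple, so a single pattern
// like Fixed(d, s) can bind both names at once.
case class DecimalInfo(precision: Int, scale: Int)

object Fixed {
  // Option[(Int, Int)] is how an extractor "returns" two values.
  def unapply(info: DecimalInfo): Option[(Int, Int)] =
    Some((info.precision, info.scale))
}

val described = DecimalInfo(10, 2) match {
  case Fixed(d, s) => s"precision=$d scale=$s"
}
```

This is the same mechanism that lets `DecimalType.Fixed(d, s)` in the diff above destructure precision and scale in one pattern.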
Github user rtreffer commented on a diff in the pull request: https://github.com/apache/spark/pull/6796#discussion_r32892827

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/parquet/ParquetTypes.scala ---
```
@@ -289,7 +295,20 @@ private[parquet] object ParquetTypesConverter extends Logging {
       name: String,
       nullable: Boolean = true,
       inArray: Boolean = false,
+      parquetSchema: Option[ParquetType] = None,
       toThriftSchemaNames: Boolean = false): ParquetType = {
+
+    val parquetElementTypeBySchema =
```
--- End diff --

It also performs a type check / conversion; that's why I've removed it. It would look like this:

```
val parquetElementTypeBySchema = parquetSchema
  .filter(_.isInstanceOf[ParquetGroupType])
  .filter(_.containsField(name))
  .map(_.getType(name))
```

I would settle on `collect`; does that look OK?

```
val parquetElementTypeBySchema = parquetSchema.collect {
  case gType: ParquetGroupType if gType.containsField(name) => gType.getType(name)
}
```
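The two alternatives under discussion can be contrasted with a self-contained sketch. `GroupType`, `PType`, and both helper functions are simplified stand-ins invented here, not the real Parquet classes:

```scala
// Simplified stand-ins (hypothetical) for the Parquet types in the diff.
sealed trait PType
case class GroupType(fields: Map[String, String]) extends PType {
  def containsField(name: String): Boolean = fields.contains(name)
  def getType(name: String): String = fields(name)
}
case object PrimitiveType extends PType

// filter/map chain: needs an explicit cast after the isInstanceOf check.
def bySchemaFilterMap(schema: Option[PType], name: String): Option[String] =
  schema
    .filter(_.isInstanceOf[GroupType])
    .map(_.asInstanceOf[GroupType])
    .filter(_.containsField(name))
    .map(_.getType(name))

// collect: the typed pattern performs the check and the cast in one step.
def bySchemaCollect(schema: Option[PType], name: String): Option[String] =
  schema.collect {
    case g: GroupType if g.containsField(name) => g.getType(name)
  }

val schema: Option[PType] = Some(GroupType(Map("price" -> "FIXED_LEN_BYTE_ARRAY")))
```

The `collect` form is shorter because a `case g: GroupType` pattern subsumes both the `isInstanceOf` filter and the cast, which is the trade-off the reviewers are weighing.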
Github user davies commented on a diff in the pull request: https://github.com/apache/spark/pull/6796#discussion_r32894165

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/parquet/ParquetTypes.scala ---
```
@@ -229,11 +231,15 @@ private[parquet] object ParquetTypesConverter extends Logging {
       case LongType => Some(ParquetTypeInfo(ParquetPrimitiveTypeName.INT64))
       case TimestampType => Some(ParquetTypeInfo(ParquetPrimitiveTypeName.INT96))
       case DecimalType.Fixed(precision, scale) if precision <= 18 =>
-        // TODO: for now, our writer only supports decimals that fit in a Long
         Some(ParquetTypeInfo(ParquetPrimitiveTypeName.FIXED_LEN_BYTE_ARRAY,
```
--- End diff --

Makes sense.
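The `precision <= 18` guard exists because any unscaled decimal value with at most 18 digits fits in a signed 64-bit `Long` (10^18 < 2^63). A small sketch of that arithmetic, with a helper name made up for illustration (not Spark's code):

```scala
// Hypothetical helper: smallest number of bytes that can hold the signed
// unscaled value of a decimal with the given precision.
def minBytesForPrecision(precision: Int): Int = {
  // bits for the magnitude 10^precision - 1, plus one sign bit
  val bits = math.ceil(precision * math.log(10) / math.log(2)).toInt + 1
  math.ceil(bits / 8.0).toInt
}
```

Precision 18 needs 61 bits (8 bytes), so a `Long`-backed `FIXED_LEN_BYTE_ARRAY` suffices; precision 19 already needs 65 bits, which is why the writer at this point only handled decimals up to 18 digits.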
Github user davies commented on a diff in the pull request: https://github.com/apache/spark/pull/6796#discussion_r32894152

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/parquet/ParquetTypes.scala ---
```
@@ -289,7 +295,20 @@ private[parquet] object ParquetTypesConverter extends Logging {
       name: String,
       nullable: Boolean = true,
       inArray: Boolean = false,
+      parquetSchema: Option[ParquetType] = None,
       toThriftSchemaNames: Boolean = false): ParquetType = {
+
+    val parquetElementTypeBySchema =
```
--- End diff --

This one is better.
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/6796#issuecomment-113950929

[Test build #35414 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/35414/consoleFull) for PR 6796 at commit [`f973b58`](https://github.com/apache/spark/commit/f973b582150f21a8d8e97937585cd11c407f29fc).