Github user rxin commented on the issue:
https://github.com/apache/spark/pull/16781
Did we conduct any performance tests on this patch?
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this
Github user squito commented on the issue:
https://github.com/apache/spark/pull/16781
great! thanks @ueshin
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes
Github user ueshin commented on the issue:
https://github.com/apache/spark/pull/16781
Thanks! Merging to master.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16781
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/76556/
Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16781
Merged build finished. Test PASSed.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16781
**[Test build #76556 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/76556/testReport)**
for PR 16781 at commit
Github user ueshin commented on the issue:
https://github.com/apache/spark/pull/16781
LGTM, pending Jenkins.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16781
**[Test build #76556 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/76556/testReport)**
for PR 16781 at commit
Github user ueshin commented on the issue:
https://github.com/apache/spark/pull/16781
Jenkins, retest this please.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16781
Merged build finished. Test PASSed.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16781
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/76419/
Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16781
**[Test build #76419 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/76419/testReport)**
for PR 16781 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16781
**[Test build #76419 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/76419/testReport)**
for PR 16781 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16781
Merged build finished. Test PASSed.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16781
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/76391/
Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16781
**[Test build #76391 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/76391/testReport)**
for PR 16781 at commit
Github user squito commented on the issue:
https://github.com/apache/spark/pull/16781
@ueshin sorry it took me a while to figure out how a table partitioned by
timestamps work (I didn't even realize that was possible, I don't think it is
in hive?) and I was traveling.
The
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16781
**[Test build #76391 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/76391/testReport)**
for PR 16781 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16781
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/76141/
Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16781
Merged build finished. Test PASSed.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16781
**[Test build #76141 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/76141/testReport)**
for PR 16781 at commit
Github user squito commented on the issue:
https://github.com/apache/spark/pull/16781
@ueshin updated per your feedback.
I should have explained that the last update *did* handle partition tables
(it added the second call to `getStorageTzOptions` in `HiveMetastoreCatalog`),
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16781
**[Test build #76141 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/76141/testReport)**
for PR 16781 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16781
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/75966/
Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16781
Merged build finished. Test PASSed.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16781
**[Test build #75966 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75966/testReport)**
for PR 16781 at commit
Github user squito commented on the issue:
https://github.com/apache/spark/pull/16781
@ueshin I've pushed an update which addresses your comments. I also
realized that partitioned tables weren't handled correctly! I fixed that as
well.
---
If your project is set up for it, you
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16781
**[Test build #75966 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75966/testReport)**
for PR 16781 at commit
Github user squito commented on the issue:
https://github.com/apache/spark/pull/16781
@ueshin thanks for taking a look. Yes, that understanding is correct.
Another way to think about it is to compare those same operations with
different file formats, eg. textfile. Those work more
Github user ueshin commented on the issue:
https://github.com/apache/spark/pull/16781
@squito I'm sorry for late reply.
I was reviewing this pr again and current Hive behavior deeply.
It seems the both behaviors are the same when the table property exists,
but this is a
Github user squito commented on the issue:
https://github.com/apache/spark/pull/16781
@ueshin thanks for taking a look earlier, sorry it has taken me some time
to update this.
Things to note since last time:
1) Hive has seen been updated in
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16781
Merged build finished. Test PASSed.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16781
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/75686/
Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16781
**[Test build #75686 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75686/testReport)**
for PR 16781 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16781
**[Test build #75686 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75686/testReport)**
for PR 16781 at commit
Github user squito commented on the issue:
https://github.com/apache/spark/pull/16781
thanks @ueshin ... I am going to chat w/ some folks involved in that hive
patch, that was not my understanding conceptually of their patch. I heard that
there is a bug they need to fix so maybe its
Github user ueshin commented on the issue:
https://github.com/apache/spark/pull/16781
Btw, I noticed that if we decided to use
`SQLConf.PARQUET_TABLE_INCLUDE_TIMEZONE`, the default
`PARQUET_TIMEZONE_TABLE_PROPERTY` value should be `GMT` instead of session
local timezone as Hive
Github user ueshin commented on the issue:
https://github.com/apache/spark/pull/16781
I looked over this pr and I noticed that you're using the
`TimeZone.getDefault()` as the "local" timezone.
Since the timestamp values are represented as "the number of micros since
epoch"
Github user squito commented on the issue:
https://github.com/apache/spark/pull/16781
@ueshin I've updated this to remove the conf entirely. So updating your
previous description, the new behavior can be described as:
when creating table:
* if a property
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16781
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/75177/
Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16781
Merged build finished. Test PASSed.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16781
**[Test build #75177 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75177/testReport)**
for PR 16781 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16781
**[Test build #75177 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75177/testReport)**
for PR 16781 at commit
Github user ueshin commented on the issue:
https://github.com/apache/spark/pull/16781
Ah, I see, it looks good to me for now.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
Github user squito commented on the issue:
https://github.com/apache/spark/pull/16781
> Do you mean the default SQLConf.PARQUET_TABLE_INCLUDE_TIMEZONE should be
false for now?
If so, I agree with it.
I was proposing something more drastic -- eliminating that conf
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16781
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/75101/
Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16781
Merged build finished. Test PASSed.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16781
**[Test build #75101 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75101/testReport)**
for PR 16781 at commit
Github user ueshin commented on the issue:
https://github.com/apache/spark/pull/16781
Do you mean the default `SQLConf.PARQUET_TABLE_INCLUDE_TIMEZONE` should be
`false` for now?
If so, I agree with it.
---
If your project is set up for it, you can reply to this email and have
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16781
**[Test build #75101 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75101/testReport)**
for PR 16781 at commit
Github user squito commented on the issue:
https://github.com/apache/spark/pull/16781
@ueshin thanks for taking a look. Yes, I think your summary is exactly
correct.
Your suggestion of using the session timezone makes a lot of sense. I can
update the pr accordingly. But I
Github user ueshin commented on the issue:
https://github.com/apache/spark/pull/16781
I checked HIVE-12767 and reviewed this pr roughly.
And I collected my thoughts about the desired behavior of this issue,
please correct me if I'm wrong.
when creating table:
-
Github user squito commented on the issue:
https://github.com/apache/spark/pull/16781
pinging some potential reviewers:
@liancheng @yhuai for prior work in hive compatibility
@ueshin for work on timezone support
Note that the timezone support from SPARK-18350 is to
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16781
Merged build finished. Test PASSed.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16781
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/74688/
Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16781
**[Test build #74688 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74688/testReport)**
for PR 16781 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16781
**[Test build #74688 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74688/testReport)**
for PR 16781 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16781
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/74611/
Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16781
Merged build finished. Test PASSed.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16781
**[Test build #74611 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74611/testReport)**
for PR 16781 at commit
60 matches
Mail list logo