Github user wzhfy commented on the issue:
https://github.com/apache/spark/pull/15090
@gatorsmile Hive tables don't support case-sensitive column names, so I use
data source tables in the added test cases. See [the followup
pr](https://github.com/apache/spark/pull/15360)
---
Github user wzhfy commented on the issue:
https://github.com/apache/spark/pull/15090
@rxin Thanks, I'll fix them in the followup pr.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this
Github user viirya commented on the issue:
https://github.com/apache/spark/pull/15090
LGTM except for one minor comment regarding the ndv config documentation.
---
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/15090
Another test case for Unicode column names in ANALYZE COLUMN:
```Scala
// scalastyle:off
// non ascii characters are not allowed in the source code, so we disable the
```
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/15090
**[Test build #66227 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66227/consoleFull)**
for PR 15090 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/15090
**[Test build #66227 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66227/consoleFull)**
for PR 15090 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/15090
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/66226/
Test FAILed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/15090
**[Test build #66226 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66226/consoleFull)**
for PR 15090 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/15090
Merged build finished. Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/15090
**[Test build #66188 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66188/consoleFull)**
for PR 15090 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/15090
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/66150/
Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/15090
Merged build finished. Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/15090
**[Test build #66150 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66150/consoleFull)**
for PR 15090 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/15090
**[Test build #66150 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66150/consoleFull)**
for PR 15090 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/15090
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/66131/
Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/15090
Merged build finished. Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/15090
**[Test build #66131 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66131/consoleFull)**
for PR 15090 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/15090
**[Test build #66131 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66131/consoleFull)**
for PR 15090 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/15090
Merged build finished. Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/15090
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/66053/
Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/15090
**[Test build #66053 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66053/consoleFull)**
for PR 15090 at commit
Github user wzhfy commented on the issue:
https://github.com/apache/spark/pull/15090
When encoding, I convert the InternalRow(UnsafeRow) into a byte array and
use Base64 to encode as a string; when decoding, use Base64 to decode from a
string and convert the byte array to an
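A minimal sketch of the Base64 round trip described in this comment, assuming the row has already been turned into a byte array. The object name `StatCodec` and its two methods are hypothetical and are not taken from the PR; only `java.util.Base64` from the JDK is used, and the conversion between `InternalRow`/`UnsafeRow` and bytes is left out.

```scala
import java.util.Base64

// Hypothetical helper, not the PR's code: shows only the bytes <-> string step.
object StatCodec {
  // Encode raw bytes (standing in for the bytes backing a row) as a Base64 string.
  def encode(bytes: Array[Byte]): String =
    Base64.getEncoder.encodeToString(bytes)

  // Decode the Base64 string back into the original bytes.
  def decode(encoded: String): Array[Byte] =
    Base64.getDecoder.decode(encoded)
}
```

Base64 keeps the persisted value a plain ASCII string, which is why it suits a store that only accepts string properties; the cost is roughly a one-third size increase over the raw bytes.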
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/15090
**[Test build #66053 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66053/consoleFull)**
for PR 15090 at commit
Github user wzhfy commented on the issue:
https://github.com/apache/spark/pull/15090
I have a problem in converting between InternalRow and String, as now we
are using InternalRow to represent ColumnStat.
Since we want to persist ColumnStat into metastore and we use table
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/15090
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65994/
Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/15090
Merged build finished. Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/15090
**[Test build #65994 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65994/consoleFull)**
for PR 15090 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/15090
**[Test build #65994 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65994/consoleFull)**
for PR 15090 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/15090
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65950/
Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/15090
Merged build finished. Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/15090
**[Test build #65950 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65950/consoleFull)**
for PR 15090 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/15090
**[Test build #65950 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65950/consoleFull)**
for PR 15090 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/15090
Merged build finished. Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/15090
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65941/
Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/15090
**[Test build #65941 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65941/consoleFull)**
for PR 15090 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/15090
**[Test build #65941 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65941/consoleFull)**
for PR 15090 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/15090
Merged build finished. Test FAILed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/15090
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65935/
Test FAILed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/15090
**[Test build #65935 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65935/consoleFull)**
for PR 15090 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/15090
**[Test build #65935 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65935/consoleFull)**
for PR 15090 at commit
Github user wzhfy commented on the issue:
https://github.com/apache/spark/pull/15090
To help us choose a better design, we need to first clarify the usage of
column stats.
A simple example may look like this (e.g. predicate: col < 5):
```scala
filter.condition match {
```
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/15090
Had an offline discussion with @wzhfy; here is the result:
1. The current `ColumnStats` is hard to use because most of its fields are
`Option`, some are `Option[Any]`, and we may need a
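To illustrate why `Option[Any]` fields are awkward to consume, here is a small sketch. The class `LooseColumnStats` and the helper `maxAsDouble` are hypothetical stand-ins, not the `ColumnStats` class from the PR; they only show the unwrap-and-match step that every caller would need.

```scala
// Hypothetical shape for illustration only; the PR's ColumnStats may differ.
final case class LooseColumnStats(
    ndv: Option[Long],
    min: Option[Any],
    max: Option[Any])

object LooseColumnStats {
  // Reading a numeric bound means unwrapping the Option and then
  // matching on the runtime type of the untyped value.
  def maxAsDouble(stats: LooseColumnStats): Option[Double] =
    stats.max.collect {
      case i: Int    => i.toDouble
      case l: Long   => l.toDouble
      case d: Double => d
    }
}
```

A non-numeric maximum (for example a string) falls through the `collect` and yields `None`, so the caller cannot tell a missing statistic from an unsupported type without further checks.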
Github user wzhfy commented on the issue:
https://github.com/apache/spark/pull/15090
- For point 1, if we use a different schema for every type, we need to do
type matching and convert to the corresponding typed row every time we want to
use some column statistics, which would be tedious
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/15090
In this PR, we use `ColumnStats` to represent the column statistics in
memory, and persist it to the Hive metastore by converting it to a string with
the format `a=1,b=2`. This brings 2 problems:
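For reference, a minimal sketch of what a `a=1,b=2` style encoding looks like. The object `KeyValueStats` and its methods are hypothetical and are not the PR's implementation; the sketch assumes every entry contains an `=` and that keys and values contain no commas.

```scala
// Hypothetical illustration of the "a=1,b=2" string format; not the PR's code.
object KeyValueStats {
  // Join each (name, value) pair as name=value, separated by commas.
  def serialize(stats: Seq[(String, String)]): String =
    stats.map { case (k, v) => s"$k=$v" }.mkString(",")

  // Split on commas, then on the first '=' of each entry.
  // An entry without '=' throws a MatchError in this sketch.
  def parse(s: String): Map[String, String] =
    s.split(",").filter(_.nonEmpty).map { kv =>
      val Array(k, v) = kv.split("=", 2)
      k -> v
    }.toMap
}
```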
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/15090
Merged build finished. Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/15090
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65810/
Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/15090
**[Test build #65810 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65810/consoleFull)**
for PR 15090 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/15090
**[Test build #65810 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65810/consoleFull)**
for PR 15090 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/15090
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65793/
Test FAILed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/15090
Merged build finished. Test FAILed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/15090
**[Test build #65793 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65793/consoleFull)**
for PR 15090 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/15090
**[Test build #65793 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65793/consoleFull)**
for PR 15090 at commit
Github user wzhfy commented on the issue:
https://github.com/apache/spark/pull/15090
@gatorsmile OK, I'll rebase and add some tests to handle negative cases.
---
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/15090
Although no conflict is detected, we should still fetch and merge the
latest master. Then, the changes made in `DataSourceStrategy.scala` will
disappear.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/15090
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65754/
Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/15090
Merged build finished. Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/15090
**[Test build #65754 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65754/consoleFull)**
for PR 15090 at commit
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/15090
The test suite `StatisticsColumnSuite` is missing negative cases. For
example, so far, we do not allow users to analyze temporary tables.
Ideally, all the exceptions the code could
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/15090
**[Test build #65754 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65754/consoleFull)**
for PR 15090 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/15090
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65749/
Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/15090
Merged build finished. Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/15090
**[Test build #65749 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65749/consoleFull)**
for PR 15090 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/15090
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65746/
Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/15090
Merged build finished. Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/15090
**[Test build #65746 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65746/consoleFull)**
for PR 15090 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/15090
**[Test build #65749 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65749/consoleFull)**
for PR 15090 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/15090
**[Test build #65746 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65746/consoleFull)**
for PR 15090 at commit
Github user wzhfy commented on the issue:
https://github.com/apache/spark/pull/15090
retest this please
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/15090
Merged build finished. Test FAILed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/15090
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65745/
Test FAILed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/15090
**[Test build #65745 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65745/consoleFull)**
for PR 15090 at commit
Github user wzhfy commented on the issue:
https://github.com/apache/spark/pull/15090
The latest update includes the following changes:
1. move the test suite for column stats into sql/core;
2. extract the computing logic in `AnalyzeColumnCommand` as a separate
method;
3.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/15090
**[Test build #65745 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65745/consoleFull)**
for PR 15090 at commit
Github user srinathshankar commented on the issue:
https://github.com/apache/spark/pull/15090
Actually, I didn't mean to approve immediately, sorry.
---
Github user wzhfy commented on the issue:
https://github.com/apache/spark/pull/15090
@gatorsmile OK, I'll modify the test suite.
---
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/15090
A SQL function. : ) I might have underestimated the effort.
Like what @hvanhovell said, how about adding a test suite in sql/core for
verifying [`updateStats
Github user hvanhovell commented on the issue:
https://github.com/apache/spark/pull/15090
I would argue strongly against creating a single aggregate function to
calculate all statistics, for a couple of reasons:
1. This would create a tonne of duplicated code (integrating
Github user rxin commented on the issue:
https://github.com/apache/spark/pull/15090
What do you mean by a built-in function? A SQL function, or just a normal
Scala function? Sorry for asking because it is vague given the context we are
in.
---
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/15090
@rxin @wzhfy Just my 2 cents. To address @rxin 's comment, we can implement
a built-in function, `compute_stats`, like what Hive does. The actual
implementation of `AnalyzeColumnCommand` can be
Github user wzhfy commented on the issue:
https://github.com/apache/spark/pull/15090
How about changing the returned result of the `ANALYZE` command, as @gatorsmile
suggested in the
[comment](https://github.com/apache/spark/pull/15090#discussion_r79547634)?
Then we can compare collected
Github user rxin commented on the issue:
https://github.com/apache/spark/pull/15090
I looked at your test cases. Do any of them actually depend on
things in the Hive module? There is also an in-memory catalog for sql/core that
you can use.
In addition, unfortunately
Github user wzhfy commented on the issue:
https://github.com/apache/spark/pull/15090
@rxin Because we want to test storing/loading these stats from metastore,
and make sure they are right after we load them into our catalogTable.
---
Github user rxin commented on the issue:
https://github.com/apache/spark/pull/15090
@wzhfy one question: why are the test suites in sql/hive? Can't they live
in sql/core?
---
Github user wzhfy commented on the issue:
https://github.com/apache/spark/pull/15090
This pr has been updated. Can you take another look? @rxin @hvanhovell
@cloud-fan @gatorsmile
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/15090
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65690/
Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/15090
Merged build finished. Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/15090
**[Test build #65690 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65690/consoleFull)**
for PR 15090 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/15090
**[Test build #65690 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65690/consoleFull)**
for PR 15090 at commit
Github user wzhfy commented on the issue:
https://github.com/apache/spark/pull/15090
> Do you still remember the test case I showed you in table-level
> statistics? A table with zero columns. Can you add a test case for that scenario?
What's the purpose of adding this test?
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/15090
Merged build finished. Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/15090
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65633/
Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/15090
**[Test build #65633 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65633/consoleFull)**
for PR 15090 at commit
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/15090
Do you still remember the test case I showed you in table-level statistics?
A table with zero columns. Can you add a test case for that scenario?
---
Github user wzhfy commented on the issue:
https://github.com/apache/spark/pull/15090
This pr has been updated based on all the above comments. The changes are as
follows:
1. Modify analyze syntax a little bit: `identifierSeq` is now non-optional,
i.e. users must specify column names
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/15090
**[Test build #65633 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65633/consoleFull)**
for PR 15090 at commit
Github user wzhfy commented on the issue:
https://github.com/apache/spark/pull/15090
@gatorsmile Yeah, my latest code contains lots of changes based on the
comments. I'll list them when I finish. And I'll also create a separate pr for
the bug fix. Thanks for the advice!
---
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/15090
@wzhfy You also fixed a bug in this PR. Could you create a separate PR?
Then, we can backport it easily if needed. Sometimes, if you fix a bug in a
huge PR like this, it might be hard for us to
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/15090
@wzhfy In the implementation, you have a few limitations. Could you
improve/update your PR description? It can help the future code maintainers
understand what you did in this PR and why you did
Github user wzhfy commented on the issue:
https://github.com/apache/spark/pull/15090
@rxin Yeah, I think it's better to move histograms into ColumnStats than to
maintain two members like BasicColStats and Histograms. Let me rename
`BasicColStats` as `ColumnStats` so that all the
Github user rxin commented on the issue:
https://github.com/apache/spark/pull/15090
What is "basic" about it? Are we going to have something that's not basic in
the future (e.g. histogram)? If yes, should those go into a separate class or
just in ColumnStats?
---