zhou degao created KYLIN-2486:
---------------------------------
Summary: buffer overflow when Extract Fact Table Distinct Columns
Key: KYLIN-2486
URL: https://issues.apache.org/jira/browse/KYLIN-2486
Project: Kylin
Issue Type: Bug
Components: Job Engine
Affects Versions: v1.6.0
Environment: kylin1.6
Reporter: zhou degao
Assignee: Dong Li
I want to know whether the following error is caused by a string field value that is too long.
Insane record: [spark, apache, branch-1.3, Michael Armbrust,
[email protected], Add combiner to avoid NPE when spark performs external
aggregation. 29effad [Michael Armbrust] Include alias in attributes that are
produced by overridden tables. 9990ec7 [Michael Armbrust] Merge pull request
#28 from liancheng/columnPruning f22df3a [Michael Armbrust] Merge pull
request #37 from yhuai/SerDe cf4db59 [Lian, Cheng] Added golden answers for
PruningSuite 54f165b [Lian, Cheng] Fixed spelling typo in two golden answer
file names 2682f72 [Lian, Cheng] Merge remote-tracking branch
'origin/master' into columnPruning c5a4fab [Lian, Cheng] Merge branch
'master' into columnPruning f670c8c [Yin Huai] Throw a NotImplementedError
for not supported clauses in a CTAS query. 128a9f8 [Yin Huai] Minor changes.
017872c [Yin Huai] Remove stats20 from whitelist. a1a4776 [Yin Huai]
Update comments. feb022c [Yin Huai] Partitioning key should be case
insensitive. 555fb1d [Yin Huai] Correctly set the extension for a text file.
d00260b [Yin Huai] Strips backticks from partition keys. 334aace [Yin
Huai] New golden files. a40d6d6 [Yin Huai] Loading the static partition
specified in a INSERT INTO/OVERWRITE query. 428aff5 [Yin Huai] Distinguish
`INSERT INTO` and `INSERT OVERWRITE`. eea75c5 [Yin Huai] Correctly set
codec. 45ffb86 [Yin Huai] Merge remote-tracking branch 'upstream/master'
into SerDeNew e089627 [Yin Huai] Code style. 563bb22 [Yin Huai] Set
compression info in FileSinkDesc. 35c9a8a [Michael Armbrust] Merge pull
request #46 from marmbrus/reviewFeedback bdab5ed [Yin Huai] Add a TODO for
loading data into partitioned tables. 5495fab [Yin Huai] Remove cloneRecords
which is no longer needed. 1596e1b [Yin Huai] Cleanup imports to make
IntelliJ happy. 3bb272d [Michael Armbrust] move org.apache.spark.sql
package.scala to the correct location. 8506c17 [Michael Armbrust] Address
review feedback. 3cb4f2e [Michael Armbrust] Merge pull request #45 from
tnachen/master 9ad474d [Michael Armbrust] Merge pull request #44 from
marmbrus/sampling 566fd66 [Timothy Chen] Whitelist tests and add support for
Binary type 69adf72 [Yin Huai] Set cloneRecords to false. a9c3188
[Timothy Chen] Fix udaf struct return 346f828 [Yin Huai] Move
SharkHadoopWriter to the correct location. 59e37a3 [Yin Huai] Merge
remote-tracking branch 'upstream/master' into SerDeNew ed3a1d1 [Yin Huai]
Load data directly into Hive. 7f206b5 [Michael Armbrust] Add support for
hive TABLESAMPLE PERCENT. b6de691 [Michael Armbrust] Merge pull request #43
from liancheng/fixMakefile 1f6260d [Lian, Cheng] Fixed package name and test
suite name in Makefile 5ae010f [Michael Armbrust] Merge pull request #42
from markhamstra/non-ascii 678341a [Mark Hamstra] Replaced non-ascii text
887f928 [Yin Huai] Merge remote-tracking branch 'upstream/master' into SerDeNew
1f7d00a [Reynold Xin] Merge pull request #41 from marmbrus/splitComponents
7588a57 [Michael Armbrust] Break into 3 major components and move everything
into the org.apache.spark.sql package. bc9a12c [Michael Armbrust] Move hive
test files. 5720d2b [Lian, Cheng] Fixed comment typo f0c3742 [Lian,
Cheng] Refactored PhysicalOperation f235914 [Lian, Cheng] Test case
udf_regex and udf_like need BooleanWritable registered cf691df [Lian, Cheng]
Added the PhysicalOperation to generalize ColumnPrunings 2407a21 [Lian,
Cheng] Added optimized logical plan to debugging output a7ad058 [Michael
Armbrust] Merge pull request #40 from marmbrus/includeGoldens 9329820
[Michael Armbrust] add golden answer files to repository dce0593 [Michael
Armbrust] move golden answer to the source code directory. 964368f [Michael
Armbrust] Merge pull request #39 from marmbrus/lateralView 7785ee6 [Michael
Armbrust] Tighten visibility based on comments. 341116c [Michael Armbrust]
address comments. 0e6c1d7 [Reynold Xin] Merge pull request #38 from
yhuai/parseDBNameInCTAS 2897deb [Michael Armbrust] fix scaladoc 7123225
[Yin Huai] Correctly parse the db name and table name in INSERT queries.
b376d15 [Michael Armbrust] fix newlines at EOF 5cc367c [Michael Armbrust]
use berkeley instead of cloudbees ff5ea3f [Michael Armbrust] new golden
db92adc [Michael Armbrust] more tests passing. clean up logging. 740febb
[Michael Armbrust] Tests for tgfs. 0ce61b0 [Michael Armbrust] Docs for
GenericHiveUdtf. ba8897f [Michael Armbrust] Merge remote-tracking branch
'yin/parseDBNameInCTAS' into lateralView dd00b7e [Michael Armbrust] initial
implementation of generators. ea76cf9 [Michael Armbrust] Add NoRelation to
planner. bea4b7f [Michael Armbrust] Add SumDistinct. 016b489 [Michael
Armbrust] fix typo. acb9566 [Michael Armbrust] Correctly type attributes of
CTAS. 8841eb8 [Michael Armbrust] Rename Transform -> ScriptTransformation.
02ff8e4 [Yin Huai] Correctly parse the db name and table name in a CTAS query.
5e4d9b4 [Michael Armbrust] Merge pull request #35 from marmbrus/smallFixes
5479066 [Reynold Xin] Merge pull request #36 from marmbrus/partialAgg
8017afb [Michael Armbrust] fix copy paste error. dc6353b [Michael Armbrust]
turn off deprecation cab1a84 [Michael Armbrust] Fix PartialAggregate
inheritance. 883006d [Michael Armbrust] improve tests. 32b615b [Michael
Armbrust] add override to asPartial. e1999f9 [Yin Huai] Use Deserializer and
Serializer instead of AbstractSerDe. f94345c [Michael Armbrust] fix doc link
d8cb805 [Michael Armbrust] Implement partial aggregation. ccdb07a
[Michael Armbrust] Fix bug where averages of strings are turned into sums of
strings. Remove a blank line. b4be6a5 [Michael Armbrust] better logging
when applying rules. 67128b8 [Reynold Xin] Merge pull request #30 from
marmbrus/complex cb57459 [Michael Armbrust] blacklist machine specific test.
2f27604 [Michael Armbrust] Address comments / style errors. 389525d
[Michael Armbrust] update golden, blacklist mr. e3c10bd [Michael Armbrust]
update whitelist. 44d343c [Michael Armbrust] Merge remote-tracking branch
'databricks/master' into complex 42ec4af [Michael Armbrust] improve complex
type support in hive udfs/udafs. ab5bff3 [Michael Armbrust] Support for get
item of map types. 1679554 [Michael Armbrust] add toString for if and IS NOT
NULL. ab9a131 [Michael Armbrust] when UDFs fail they should return null.
25288d0 [Michael Armbrust] Implement [] for arrays and maps. e7933e9
[Michael Armbrust] fix casting bug when working with fractional expressions.
010accb [Michael Armbrust] add tinyint to metastore type parser. 7a0f543
[Michael Armbrust] Avoid propagating types from unresolved nodes. ac9d7de
[Michael Armbrust] Resolve *s in Transform clauses. 692a477 [Michael
Armbrust] Support for wrapping arrays to be written into hive tables.
92e4158 [Reynold Xin] Merge pull request #32 from marmbrus/tooManyProjects
9c06778 [Michael Armbrust] fix serialization issues, add
JavaStringObjectInspector. 72a003d [Michael Armbrust] revert regex change
7661b6c [Michael Armbrust] blacklist machines specific tests aa430e7
[Michael Armbrust] Update .travis.yml e4def6b [Michael Armbrust] set
dataType for HiveGenericUdfs. 5e54aa6 [Michael Armbrust] quotes for struct
field names. bbec500 [Michael Armbrust] update test coverage, new golden
3734a94 [Michael Armbrust] only quote string types. 3f9e519 [Michael
Armbrust] use names w/ boolean args 5b3d2c8 [Michael Armbrust] implement
distinct. 5b33216 [Michael Armbrust] work on decimal support. 2c6deb3
[Michael Armbrust] improve printing compatibility. 35a70fb [Michael
Armbrust] multi-letter field names. a9388fb [Michael Armbrust] printing for
map types. c3feda7 [Michael Armbrust] use toArray. c654f19 [Michael
Armbrust] Support for list and maps in hive table scan. cf8d992 [Michael
Armbrust] Use built in functions for creating temp directory. 1579eec
[Michael Armbrust] Only cast unresolved inserts. 6420c7c [Michael Armbrust]
Memoize the ordinal in the GetField expression. da7ae9d [Michael Armbrust]
Add boolean writable that was breaking udf_regexp test. Not sure how this was
passing before... 6709441 [Michael Armbrust] Evaluation for accessing nested
fields. dc6463a [Michael Armbrust] Support for resolving access to nested
fields using "." notation. d670e41 [Michael Armbrust] Print nested fields
like hive does. efa7217 [Michael Armbrust] Support for reading structs in
HiveTableScan. 9c22b4e [Michael Armbrust] Support for parsing nested types.
82163e3 [Michael Armbrust] special case handling of partitionKeys when
casting insert into tables ea6f37f [Michael Armbrust] fix style. 7845364
[Michael Armbrust] deactivate concurrent test. b649c20 [Michael Armbrust]
fix test logging / caching. 1590568 [Michael Armbrust] add log4j.properties
19bfd74 [Michael Armbrust] store hive output in circular buffer dfb67aa
[Michael Armbrust] add test case cb775ac [Michael Armbrust] get rid of
SharkContext singleton 2de89d0 [Michael Armbrust] Merge pull request #13
from tnachen/master 63003e9 [Michael Armbrust] Fix spacing. 41b41f3
[Michael Armbrust] Only cast unresolved inserts. 6eb5960 [Michael Armbrust]
Merge remote-tracking branch 'databricks/master' into udafs 5b7afd8 [Michael
Armbrust] Merge pull request #10 from yhuai/exchangeOperator b1151a8
[Timothy Chen] Fix load data regex 8e0931f [Michael Armbrust] Cast to avoid
using deprecated hive API. e079f2b [Timothy Chen] Add GenericUDAF wrapper
and HiveUDAFFunction 45b334b [Yin Huai] fix comments 235cbb4 [Yin Huai]
Merge remote-tracking branch 'upstream/master' into exchangeOperator fc67b50
[Yin Huai] Check for a Sort operator with the global flag set instead of an
Exchange operator with a RangePartitioning. 6015f93 [Michael Armbrust] Merge
pull request #29 from rxin/style 271e483 [Michael Armbrust] Update build
status icon. d3a3d48 [Michael Armbrust] add testing to travis 807b2d7
[Michael Armbrust] check style and publish docs with travis d20b565 [Michael
Armbrust] fix if style bce024d [Michael Armbrust] Merge remote-tracking
branch 'databricks/master' into style Disable if brace checking as it errors in
single line functional cases unlike the style guide. d91e276 [Michael
Armbrust] Remove dependence on HIVE_HOME for running tests. This was done by
moving all the hive query test (from branch-0.12) and data files into
src/test/hive. These are used by default when HIVE_HOME is not set. f47c2f6
[Yin Huai] set outputPartitioning in BroadcastNestedLoopJoin 41bbee6 [Yin
Huai] Merge remote-tracking branch 'upstream/master' into exchangeOperator
7e24436 [Reynold Xin] Removed dependency on JDK 7 (nio.file). 5c1e600
[Reynold Xin] Added hash code implementation for AttributeReference 7213a2c
[Reynold Xin] style fix for Hive.scala. 08e4d05 [Reynold Xin] First round of
style cleanup. 605255e [Reynold Xin] Added scalastyle checker. 61e729c
[Lian, Cheng] Added ColumnPrunings strategy and test cases 2486fb7 [Lian,
Cheng] Fixed spelling 8ee41be [Lian, Cheng] Minor refactoring ebb56fa
[Michael Armbrust] add travis config 4c89d6e [Reynold Xin] Merge pull
request #27 from marmbrus/moreTests d4f539a [Michael Armbrust] blacklist mr
and user specific tests. 677eb07 [Michael Armbrust] Update test whitelist.
5dab0bc [Michael Armbrust] Merge pull request #26 from
liancheng/serdeAndPartitionPruning c263c84 [Michael Armbrust] Only push
predicates into partitioned table scans. ab77882 [Michael Armbrust] upgrade
spark to RC5. c98ede5 [Lian, Cheng] Response to comments from @marmbrus
83d4520 [Yin Huai] marmbrus's comments 70994a3 [Lian, Cheng] Revert
unnecessary Scaladoc changes 9ebff47 [Yin Huai] remove unnecessary .toSeq
e811d1a [Yin Huai] markhamstra's comments 4802f69 [Yin Huai] The
outputPartitioning of a UnaryNode inherits its child's outputPartitioning by
default. Also, update the logic in AddExchange to avoid unnecessary shuffling
operations. 040fbdf [Yin Huai] AddExchange is the only place to add Exchange
operators. 9fb357a [Yin Huai] use getSpecifiedDistribution to create
Distribution. ClusteredDistribution and OrderedDistribution do not take Nil as
inptu expressions. e9347fc [Michael Armbrust] Remove broken scaladoc links.
99c6707 [Michael Armbrust] upgrade spark 57799ad [Lian, Cheng] Added
special treat for HiveVarchar in InsertIntoHiveTable cb49af0 [Lian, Cheng]
Fixed Scaladoc links 4e5e4d4 [Lian, Cheng] Added PreInsertionCasts to do
necessary casting before insertion 111ffdc [Lian, Cheng] More comments and
minor reformatting 9e0d840 [Lian, Cheng] Added partition pruning
optimization 761bbb8 [Lian, Cheng] Generalized BindReferences to run against
any query plan 04eb5da [Yin Huai] Merge remote-tracking branch
'upstream/master' into exchangeOperator 9dd3b26 [Michael Armbrust] Fix
scaladoc. 6f44cac [Lian, Cheng] Made TableReader & HadoopTableReader private
to catalyst 7c92a41 [Lian, Cheng] Added Hive SerDe support ce5fdd6 [Yin
Huai] Merge remote-tracking branch 'upstream/master' into exchangeOperator
2957f31 [Yin Huai] addressed comments on PR 907db68 [Michael Armbrust] Space
after while. 04573a0 [Reynold Xin] Merge pull request #24 from
marmbrus/binaryCasts 4e50679 [Reynold Xin] Merge pull request #25 from
marmbrus/rowOrderingWhile 5bc1dc2 [Yin Huai] Merge remote-tracking branch
'upstream/master' into exchangeOperator be1fff7 [Michael Armbrust] Replace
foreach with while in RowOrdering. Fixes #23 fd084a4 [Michael Armbrust]
implement casts binary <=> string. 0b31176 [Michael Armbrust] Merge pull
request #22 from rxin/type 548e479 [Yin Huai] merge master into
exchangeOperator and fix code style 5b11db0 [Reynold Xin] Added Void to
Boolean type widening. 9e3d989 [Reynold Xin] Made
HiveTypeCoercion.WidenTypes more clear. 9bb1979 [Reynold Xin] Merge pull
request #19 from marmbrus/variadicUnion a2beb38 [Michael Armbrust] Merge
pull request #21 from liancheng/fixIssue20 b20a4d4 [Lian, Cheng] Fix issue
#20 6d6cb58 [Michael Armbrust] add source links that point to github to the
scala doc. 4285962 [Michael Armbrust] Remove temporary test cases 167162f
[Michael Armbrust] more merge errors, cleanup. e170ccf [Michael Armbrust]
Improve documentation and remove some spurious changes that were introduced by
the merge. 6377d0b [Michael Armbrust] Drop empty files, fix if ().
c0b0e60 [Michael Armbrust] cleanup broken doc links. 330a88b [Michael
Armbrust] Fix bugs in AddExchange. 4f345f2 [Michael Armbrust] Remove
SortKey, use RowOrdering. 043e296 [Michael Armbrust] Make physical union
nodes variadic. ece15e1 [Michael Armbrust] update unit tests 5c89d2e
[Michael Armbrust] Merge remote-tracking branch 'databricks/master' into
exchangeOperator Fix deprecated use of combineValuesByKey. Get rid of test
where the answer is dependent on the plan execution width. 9804eb5 [Michael
Armbrust] upgrade spark 053a371 [Michael Armbrust] Merge pull request #15
from marmbrus/orderedRow 5ab18be [Michael Armbrust] Merge remote-tracking
branch 'databricks/master' into orderedRow ca2ff68 [Michael Armbrust] Merge
pull request #17 from marmbrus/unionTypes bf9161c [Michael Armbrust] Merge
pull request #18 from marmbrus/noSparkAgg 563053f [Michael Armbrust] Address
@rxin's comments. 6537c66 [Michael Armbrust] Address @rxin's comments.
2a76fc6 [Michael Armbrust] add notes from @rxin. 685bfa1 [Michael Armbrust]
fix spelling 69ed98f [Michael Armbrust] Output a single row for empty
Aggregations with no grouping expressions. 7859a86 [Michael Armbrust] Remove
SparkAggregate. Its kinda broken and breaks RDD lineage. fc22e01 [Michael
Armbrust] whitelist newly passing union test. 3f547b8 [Michael Armbrust] Add
support for widening types in unions. 53b95f8 [Michael Armbrust] coercion
should not occur until children are resolved. b892e32 [Michael Armbrust]
Union is not resolved until the types match up. 95ab382 [Michael Armbrust]
Use resolved instead of custom function. This is better because some nodes
override the notion of resolved. 81a109d [Michael Armbrust] fix link.
f143f61 [Michael Armbrust] Implement sampling. Fixes a flaky test where the
JVM notices that RAND as a Comparison method "violates its general contract!"
6cd442b [Michael Armbrust] Use numPartitions variable, fix grammar. c800798
[Michael Armbrust] Add build status icon. 0cf5a75 [Michael Armbrust] Merge
pull request #16 from marmbrus/filterPushDown 05d3a0d [Michael Armbrust]
Refactor to avoid serializing ordering details with every row. f2fdd77
[Michael Armbrust] fix required distribtion for aggregate. 658866e [Michael
Armbrust] Pull back in changes made by @yhuai eliminating
CoGroupedLocallyRDD.scala 583a337 [Michael Armbrust] break apart
distribution and partitioning. e8d41a9 [Michael Armbrust] Merge
remote-tracking branch 'yin/exchangeOperator' into exchangeOperator 0ff8be7
[Michael Armbrust] Cleanup spurious changes and fix doc links. 73c70de [Yin
Huai] add a first set of unit tests for data properties. fbfa437 [Michael
Armbrust] Merge remote-tracking branch 'databricks/master' into filterPushDown
Minor doc improvements. 2b9d80f [Yin Huai] initial commit of adding exchange
operators to physical plans. fcbc03b [Michael Armbrust] Fix if ().
7b9080c [Michael Armbrust] Create OrderedRow class to allow ordering to be used
by multiple operators. b4adb0f [Michael Armbrust] Merge pull request #14
from marmbrus/castingAndTypes b2a1ec5 [Michael Armbrust] add comment on how
using numeric implicitly complicates spark serialization. e286d20 [Michael
Armbrust] address code review comments. 80d0681 [Michael Armbrust] fix
scaladoc links. de0c248 [Michael Armbrust] Print the executed plan in
SharkQuery toString. 3413e61 [Michael Armbrust] Add mapChildren and
withNewChildren methods to TreeNode. 404d552 [Michael Armbrust] Better
exception when unbound attributes make it to evaluation. fb84ae4 [Michael
Armbrust] Refactor DataProperty into Distribution. 2abb0bc [Michael
Armbrust] better debug messages, use exists. 098dfc4 [Michael Armbrust]
Implement Long sorting again. 60f3a9a [Michael Armbrust] More aggregate
functions out of the aggregate class to make things more readable. a1ef62e
[Michael Armbrust] Print the executed plan in SharkQuery toString. dfce426
[Michael Armbrust] Add mapChildren and withNewChildren methods to TreeNode.
037a2ed [Michael Armbrust] Better exception when unbound attributes make it to
evaluation. ec90620 [Michael Armbrust] Support for Sets as arguments to
TreeNode classes. b21f803 [Michael Armbrust] Merge pull request #11 from
marmbrus/goldenGen 83adb9d [Yin Huai] add DataProperty 5a26292 [Michael
Armbrust] Rules to bring casting more inline with Hive semantics. f0e0161
[Michael Armbrust] Move numeric types into DataTypes simplifying evaluator.
This can probably also be use for codegen... 6d2924d [Michael Armbrust] add
support for If. Not integrated in HiveQL yet. ccc4dbf [Michael Armbrust] Add
optimization rule to simplify casts. 058ec15 [Michael Armbrust] handle more
writeables. ffa9f25 [Michael Armbrust] blacklist some more MR tests.
aa2239c [Michael Armbrust] filter test lines containing Owner: f71a325
[Michael Armbrust] Update golden jar. a3003ae [Michael Armbrust] Update
makefile to use better sharding support. 568d150 [Michael Armbrust] Updates
to white/blacklist. 8351f25 [Michael Armbrust] Add an ignored test to remind
us we don't do empty aggregations right. c4104ec [Michael Armbrust] Numerous
improvements to testing infrastructure. See comments for details. 09c6300
[Michael Armbrust] Add nullability information to StructFields. 5460b2d
[Michael Armbrust] load srcpart by default. 3695141 [Michael Armbrust] Lots
of parser improvements. 965ac9a [Michael Armbrust] Add expressions that
allow access into complex types. 3ba53c9 [Michael Armbrust] Output type
suffixes on AttributeReferences. 8777489 [Michael Armbrust] Initial support
for operators that allow the user to specify partitioning. e57f97a [Michael
Armbrust] more decimal/null support. e1440ed [Michael Armbrust] Initial
support for function specific type conversions. 1814ed3 [Michael Armbrust]
use childrenResolved function. f2ec57e [Michael Armbrust] Begin supporting
decimal. 6924e6e [Michael Armbrust] Handle NullTypes when resolving HiveUDFs
7fcfa8a [Michael Armbrust] Initial support for parsing unspecified partition
parameters. d0124f3 [Michael Armbrust] Correctly type null literals.
b65626e [Michael Armbrust] Initial support for parsing BigDecimal. a90efda
[Michael Armbrust] utility function for outputing string stacktraces.
7102f33 [Michael Armbrust] methods with side-effects should use (). 3ccaef7
[Michael Armbrust] add renaming TODO. bc282c7 [Michael Armbrust] fix bug in
getNodeNumbered c8e89d5 [Michael Armbrust] memoize inputSet calculation.
6aefa46 [Michael Armbrust] Skip folding literals. a72e540 [Michael Armbrust]
Add IN operator. 04f885b [Michael Armbrust] literals are only non-nullable
if they are not null. 35d2948 [Michael Armbrust] correctly order partition
and normal attributes in hive relation output. 12fd52d [Michael Armbrust]
support for sorting longs. 0606520 [Michael Armbrust] drop old comment.
859200a [Michael Armbrust] support for reading more types from the metastore.
1fedd18 [Michael Armbrust] coercion from null to numeric types 71e902d
[Michael Armbrust] fix test cases. cc06b6c [Michael Armbrust] Merge
remote-tracking branch 'databricks/master' into interviewAnswer 8a8b521
[Reynold Xin] Merge pull request #8 from marmbrus/testImprovment 86355a6
[Michael Armbrust] throw error if there are unexpected join clauses. c5842d2
[Michael Armbrust] don't throw an error when a select clause outputs multiple
copies of the same attribute. 0e975ea [Michael Armbrust] parse bucket
sampling as percentage sampling a92919d [Michael Armbrust] add alter view as
to native commands f58d5a5 [Michael Armbrust] support for parsing SELECT
DISTINCT f0faa26 [Michael Armbrust] add sample and distinct operators.
ef7b943 [Michael Armbrust] add metastore support for float e9f4588 [Michael
Armbrust] fix > 100 char. 755b229 [Michael Armbrust] blacklist some ddl
tests. 9ae740a [Michael Armbrust] blacklist more tests that require MR.
4cfc11a [Michael Armbrust] more test coverage. 0d9d56a [Michael Armbrust]
add more native commands to parser 78d730d [Michael Armbrust] Load src test
table on RESET. 8364ec2 [Michael Armbrust] whitelist all possible partition
values. b01468d [Michael Armbrust] support path rewrites when the query
begins with a comment. 4c6b454 [Michael Armbrust] add option for recomputing
the cached golden answer when tests fail. 4c5fb0f [Michael Armbrust]
makefile target for building new whitelist. 4b6fed8 [Michael Armbrust]
support for parsing both DESTINATION and INSERT_INTO. 516481c [Michael
Armbrust] Ignore requests to explain native commands. 68aa2e6 [Michael
Armbrust] Stronger type for Token extractor. ca4ea26 [Michael Armbrust]
Support for parsing UDF(*). 1aafea3 [Michael Armbrust] Configure partition
whitelist in TestShark reset. 9627616 [Michael Armbrust] Use current
database as default database. 9b02b44 [Michael Armbrust] Fix spelling error.
Add failFast mode. 6f64cee [Michael Armbrust] don't line wrap string literal
eafaeed [Michael Armbrust] add type documentation f54c94c [Michael
Armbrust] make golden answers file a test dependency 5362365 [Michael
Armbrust] push conditions into join 0d2388b [Michael Armbrust] Point at
databricks hosted scaladoc. 73b29cd [Michael Armbrust] fix bad casting
9aa06c5 [Michael Armbrust] Merge pull request #7 from marmbrus/docFixes
7eff191 [Michael Armbrust] link all the expression names. 83227e4 [Michael
Armbrust] fix scaladoc list syntax, add docs for some rules 9de6b74 [Michael
Armbrust] fix language feature and deprecation warnings. 0b1960a [Michael
Armbrust] Fix broken scala doc links / warnings. b1acb36 [Michael Armbrust]
Merge pull request #3 from yhuai/evalauteLiteralsInExpressions 01c00c2
[Michael Armbrust] new golden 5c14857 [Yin Huai] Merge remote-tracking
branch 'upstream/master' into evalauteLiteralsInExpressions b749b51 [Michael
Armbrust] Merge pull request #5 from marmbrus/testCaching 66adceb [Michael
Armbrust] Merge pull request #6 from marmbrus/joinWork 1a393da [Yin Huai]
folded -> foldable 1e964ea [Yin Huai] update a43d41c [Michael Armbrust]
more tests passing! 8ca38d0 [Michael Armbrust] begin support for varchar /
binary types. ab8bbd1 [Michael Armbrust] parsing % operator c16c8b5
[Michael Armbrust] case insensitive checking for hooks in tests. 3a90a5f
[Michael Armbrust] simpler output when running a single test from the
commandline. 5332fee [Yin Huai] Merge remote-tracking branch
'upstream/master' into evalauteLiteralsInExpressions 367fb9e [Yin Huai]
update 0cd5cc6 [Michael Armbrust] add BIGINT cast parsing 61b266f
[Michael Armbrust] comment for eliminate subqueries. d72a5a2 [Michael
Armbrust] add long to literal factory object. b3bd15f [Michael Armbrust]
blacklist more mr requiring tests. e06fd38 [Michael Armbrust] black list map
reduce tests. 8e7ce30 [Michael Armbrust] blacklist some env specific tests.
6250cbd [Michael Armbrust] Do not exit on test failure b22b220 [Michael
Armbrust] also look for cached hive test answers on the classpath. b6e4899
[Yin Huai] formatting e75c90d [Reynold Xin] Merge pull request #4 from
marmbrus/hive12 5fabbec [Michael Armbrust] ignore partitioned scan test.
scan seems to be working but there is some error about the table already
existing? 9e190f5 [Michael Armbrust] drop unneeded () 68b58c1 [Michael
Armbrust] drop a few more tests. b0aa400 [Michael Armbrust] update
whitelist. c99012c [Michael Armbrust] skip tests with hooks db00ebf
[Michael Armbrust] more types for hive udfs dbc3678 [Michael Armbrust]
update ghpages repo 138f53d [Yin Huai] addressed comments and added a space
after a space after the defining keyword of every control structure. 6f954ee
[Michael Armbrust] export the hadoop classpath when starting sbt, required to
invoke hive during tests. 46bf41b [Michael Armbrust] add a makefile for
priming the test answer cache in parallel. usage: "make -j 8 -i" 8d47ed4
[Yin Huai] comment 2795f05 [Yin Huai] comment e003728 [Yin Huai] move
OptimizerSuite to the package of catalyst.optimizer 2941d3a [Yin Huai] Merge
remote-tracking branch 'upstream/master' into evalauteLiteralsInExpressions
0bd1688 [Yin Huai] update 6a7bd75 [Michael Armbrust] fix partition column
delimiter configuration. e942da1 [Michael Armbrust] Begin upgrade to Hive
0.12.0. b8cd7e3 [Michael Armbrust] Merge pull request #7 from rxin/moreclean
52864da [Reynold Xin] Added executeCollect method to SharkPlan. f0e1cbf
[Reynold Xin] Added resolved lazy val to LogicalPlan. b367e36 [Reynold Xin]
Replaced the use of ??? with UnsupportedOperationException. 38124bd [Yin
Huai] formatting 2924468 [Yin Huai] add two tests for testing pre-order and
post-order tree traversal, respectively 555d839 [Reynold Xin] More cleaning
... d48d0e1 [Reynold Xin] Code review feedback. aa2e694 [Yin Huai] Merge
remote-tracking branch 'upstream/master' into evalauteLiteralsInExpressions
5c421ac [Reynold Xin] Imported SharkEnv, SharkContext, and HadoopTableReader to
remove Shark dependency. 479e055 [Reynold Xin] A set of minor changes,
including: - import order - limit some lines to 100 character wide - inline
code comment - more scaladocs - minor spacing (i.e. add a space after if)
da16e45 [Reynold Xin] Merge pull request #3 from rxin/packagename e36caf5
[Reynold Xin] Renamed Rule.name to Rule.ruleName since name is used too
frequently in the code base and is shadowed often by local scope. 72426ed
[Reynold Xin] Rename shark2 package to execution. 0892153 [Reynold Xin]
Merge pull request #2 from rxin/packagename e58304a [Reynold Xin] Merge pull
request #1 from rxin/gitignore 3f9fee1 [Michael Armbrust] rewrite push
filter through join optimization. c6527f5 [Reynold Xin] Moved the test src
files into the catalyst directory. c9777d8 [Reynold Xin] Put all source
files in a catalyst directory. 019ea74 [Reynold Xin] Updated .gitignore to
include IntelliJ files. 80ca4be [Timothy Chen] Address comments 0079392
[Michael Armbrust] support for multiple insert commands in a single query
75b5a01 [Michael Armbrust] remove space. 4283400 [Timothy Chen] Add limited
predicate push down e547e50 [Michael Armbrust] implement First. e77c9b6
[Michael Armbrust] more work on unique join. c795e06 [Michael Armbrust]
improve star expansion a26494e [Michael Armbrust] allow aliases to have
qualifiers d078333 [Michael Armbrust] remove extra space a75c023 [Michael
Armbrust] implement Coalesce 3a018b6 [Michael Armbrust] fix up docs.
ab6f67d [Michael Armbrust] import the string "null" as actual null. 5377c04
[Michael Armbrust] don't call dataType until checking if children are resolved.
191ce3e [Michael Armbrust] analyze rewrite test query. 60b1526 [Michael
Armbrust] don't call dataType until checking if children are resolved.
2ab5a32 [Michael Armbrust] stop using uberjar as it has its own set of issues.
e42f75a [Michael Armbrust] Merge remote-tracking branch 'origin/master' into
HEAD c086a35 [Michael Armbrust] docs, spacing c4060e4 [Michael Armbrust]
cleanup 3b85462 [Michael Armbrust] more tests passing bcfc8c5 [Michael
Armbrust] start supporting partition attributes when inserting data. c944a95
[Michael Armbrust] First aggregate expression. 1e28311 [Michael Armbrust]
make tests execute in alpha order again a287481 [Michael Armbrust] spelling
8492548 [Michael Armbrust] beginning of UNIQUEJOIN parsing. a6ab6c7
[Michael Armbrust] add != 4529594 [Michael Armbrust] draft of coalesce
70f253f [Michael Armbrust] more tests passing! 7349e7b [Michael Armbrust]
initial support for test thrift table d3c9305 [Michael Armbrust] fix > 100
char line 93b64b0 [Michael Armbrust] load test tables that are args to
"DESCRIBE" 06b2aba [Michael Armbrust] don't be case sensitive when fixing
load paths 6355d0e [Michael Armbrust] match actual return type of count with
expected cda43ab [Michael Armbrust] don't throw an exception when one of the
join tables is empty. fd4b096 [Michael Armbrust] fix casing of null strings
as well. 4632695 [Michael Armbrust] support for megastore bigint 67b88cf
[Michael Armbrust] more verbose debugging of evaluation return types c680e0d
[Michael Armbrust] Failed string => number conversion should return null.
2326be1 [Michael Armbrust] make getClauses case insensitive. dac2786
[Michael Armbrust] correctly handle null values when going from string to
numeric types. 045ac4b [Yin Huai] Merge remote-tracking branch
'upstream/master' into evalauteLiteralsInExpressions fb5ddfd [Michael
Armbrust] move ViewExamples to examples/ 83833e8 [Michael Armbrust] more
tests passing! 47c98d6 [Michael Armbrust] add query tests for like and hash.
1724c16 [Michael Armbrust] clear lines that contain last updated times.
cfd6bbc [Michael Armbrust] Quick skipping of tests that we can't even parse.
9b2642b [Michael Armbrust] make the blacklist support regexes 1d50af6
[Michael Armbrust] more datatypes, fix nonserializable instance variables in
udfs 910e33e [Michael Armbrust] basic support for building an assembly jar.
d55bb52 [Michael Armbrust] add local warehouse/metastore to gitignore.
495d9dc [Michael Armbrust] Add an expression for when we decide to support LIKE
natively instead of using the HIVE udf. 65f4e69 [Michael Armbrust] remove
incorrect comments 0831a3c [Michael Armbrust] support for parsing some
operator udfs. 6c27aa7 [Michael Armbrust] more cast parsing. 43db061
[Michael Armbrust] significant generalization of hive udf functionality.
3fe24ec [Michael Armbrust] better implementation of 3vl in Evaluate, fix some >
100 char lines. e5690a6 [Michael Armbrust] add BinaryType adab892
[Michael Armbrust] Clear out functions that are created during tests when reset
is called. d408021 [Michael Armbrust] support for printing out arrays in the
output in the same form as hive (e.g., [e1, e1]). 8d5f504 [Michael Armbrust]
Example of schema RDD using scala's dynamic trait, resulting in a more standard
ORM style of usage. 21f0d91 [Michael Armbrust] Simple example of schemaRdd
with scala filter function. 0daaa0e [Michael Armbrust] Promote booleans that
appear in comparisons. 2b70abf [Michael Armbrust] true and false literals.
ef8b0a5 [Michael Armbrust] more tests. 14d070f [Michael Armbrust] add
support for correctly extracting partition keys. 0afbe73 [Yin Huai] Merge
remote-tracking branch 'upstream/master' into evalauteLiteralsInExpressions
69a0bd4 [Michael Armbrust] promote strings in predicates with number too.
3946e31 [Michael Armbrust] don't build strings unless assertion fails.
90c453d [Michael Armbrust] more tests passing! 6e6417a [Michael Armbrust]
correct handling of nulls in boolean logic and sorting. 8000504 [Michael
Armbrust] Improve type coercion. 9087152 [Michael Armbrust] fix toString of
Not. 58b111c [Michael Armbrust] fix bad scaladoc tag. d5c05c6 [Michael
Armbrust] For now, ignore the big data benchmark tests when the data isn't
there. ac6376d [Michael Armbrust] Split out general shark query execution
driver from test harness. 1d0ae1e [Michael Armbrust] Switch from
IndexSeq[Any] to Row interface that will allow us unboxed access to primitive
types. d873b2b [Yin Huai] Remove numbers associated with test cases.
8545675 [Yin Huai] Merge remote-tracking branch 'upstream/master' into
evalauteLiteralsInExpressions b34a9eb [Michael Armbrust] Merge branch
'master' into filterPushDown d1e7b8e [Michael Armbrust] Update README.md
c8b1553 [Michael Armbrust] Update README.md 9307ef9 [Michael Armbrust]
update list of passing tests. 934c18c [Michael Armbrust] Filter out
non-deterministic lines when comparing test answers. a045c9c [Michael
Armbrust] SparkAggregate doesn't actually support sum right now. ae0024a
[Yin Huai] update cf80545 [Yin Huai] Merge remote-tracking branch
'origin/evalauteLiteralsInExpressions' into evalauteLiteralsInExpressions
21976ae [Yin Huai] update b4999fe [Yin Huai] Merge remote-tracking branch
'upstream/filterPushDown' into evalauteLiteralsInExpressions dedbf0c [Yin
Huai] support Boolean literals eaac9e2 [Yin Huai] explain the limitation of
the current EvaluateLiterals 37817b5 [Yin Huai] add a comment to
EvaluateLiterals. 468667f [Yin Huai] First draft of literal evaluation in
the optimization phase. TreeNode has been extended to support transform in the
post order. So, for an expression, we can evaluate literal from the leaf nodes
of this expression tree. For an attribute reference in the expression node, we
just leave it as is. b1d1843 [Michael Armbrust] more work on big data
benchmark tests. cc9a957 [Michael Armbrust] support for creating test tables
outside of TestShark 7d7fa9f [Michael Armbrust] support for create table as
5f54f03 [Michael Armbrust] parsing for ASC d42b725 [Michael Armbrust] Sum
of strings requires cast 34b30fa [Michael Armbrust] not all attributes need
to be bound (e.g. output attributes that are contained in non-leaf operators.)
81659cb [Michael Armbrust] implement transform operator. 5cd76d6 [Michael
Armbrust] break up the file based test case code for reuse 1031b65 [Michael
Armbrust] support for case insensitive resolution. 320df04 [Michael
Armbrust] add snapshot repo for databricks (has shark/spark snapshots)
b6f083e [Michael Armbrust] support for publishing scala doc to github from sbt
d9d18b4 [Michael Armbrust] debug logging implicit. 669089c [Yin Huai]
support Boolean literals ef3321e [Yin Huai] explain the limitation of the
current EvaluateLiterals 73a05fd [Yin Huai] add a comment to
EvaluateLiterals. 191eb7d [Yin Huai] First draft of literal evaluation in
the optimization phase. TreeNode has been extended to support transform in the
post order. So, for an expression, we can evaluate literal from the leaf nodes
of this expression tree. For an attribute reference in the expression node, we
just leave it as is. 80039cc [Yin Huai] Merge pull request #1 from
yhuai/master cbe1ca1 [Yin Huai] add explicit result type to the overloaded
sideBySide 5c518e4 [Michael Armbrust] fix bug in test. b50dd0e [Michael
Armbrust] fix return type of overloaded method 05679b7 [Michael Armbrust]
download assembly jar for easy compiling during interview. 8c60cc0 [Michael
Armbrust] Update README.md 03b9526 [Michael Armbrust] First draft of
optimizer tests. f392755 [Michael Armbrust] Add flatMap to TreeNode
6cbe8d1 [Michael Armbrust] fix bug in side by side, add support for working
with unsplit strings 15a53fc [Michael Armbrust] more generic sum calculation
and better binding of grouping expressions. 06749d0 [Michael Armbrust] add
expression enumerations for query plan operators and recursive version of
transform expression. 4b0a888 [Michael Armbrust] implement string comparison
and more casts. 356b321 [Michael Armbrust] Update README.md 3776395
[Michael Armbrust] Update README.md 304d17d [Michael Armbrust] Create
README.md b7d8be0 [Michael Armbrust] more tests passing. b82481f [Michael
Armbrust] add todo comment. 02e6dee [Michael Armbrust] add another test that
breaks the harness to the blacklist. cc5efe3 [Michael Armbrust] First draft
of broadcast nested loop join with full outer support. c43a259 [Michael
Armbrust] comments 15ff448 [Michael Armbrust] better error message when a
dsl test throws an exception 76ec650 [Michael Armbrust] fix join conditions
e10df99 [Michael Armbrust] Create new expr ids for local relations that exist
more than once in a query plan. 91573a4 [Michael Armbrust] initial type
promotion e2ef4a5 [Michael Armbrust] logging e43dc1e [Michael Armbrust]
add string => int cast evaluation f1f7e96 [Michael Armbrust] fix incorrect
generation of join keys 2b27230 [Michael Armbrust] add depth based subtree
access 0f6279f [Michael Armbrust] broken tests. 389bc0b [Michael
Armbrust] support for partitioned columns in output. 12584f4 [Michael
Armbrust] better errors for missing clauses. support for matching multiple
clauses with the same name. b67a225 [Michael Armbrust] better errors when
types don't match up. 9e74808 [Michael Armbrust] add children resolved.
6d03ce9 [Michael Armbrust] defaults for unresolved relation 2469b00 [Michael
Armbrust] skip nodes with unresolved children when doing coersions be5ae2c
[Michael Armbrust] better resolution logging cb7b5af [Michael Armbrust]
views example 420e05b [Michael Armbrust] more tests passing! 6916c63
[Michael Armbrust] Reading from partitioned hive tables. a1245f9 [Michael
Armbrust] more tests passing 956e760 [Michael Armbrust] extended explain
5f14c35 [Michael Armbrust] more test tables supported 175c43e [Michael
Armbrust] better errors for parse exceptions 480ade5 [Michael Armbrust]
don't use partial cached results. 8a9d21c [Michael Armbrust] fix evaluation
7aee69c [Michael Armbrust] parsing for joins, boolean logic 7fcf480
[Michael Armbrust] test for and logic 3ea9b00 [Michael Armbrust] don't use
simpleString if there are no new lines. 6902490 [Michael Armbrust] fix
boolean logic evaluation 4d5eba7 [Michael Armbrust] add more dsl for
expression arithmetic and boolean logic 8b2a2ee [Michael Armbrust] more
tests passing! ad1f3b4 [Michael Armbrust] toString for null literals
a5c0a1b [Michael Armbrust] more test harness improvements: * regex whitelist *
side by side answer comparison (still needs formatting work) 60ec19d
[Michael Armbrust] initial support for udfs c45b440 [Michael Armbrust]
support for is (not) null and boolean logic 7f4a1dc [Michael Armbrust] add
NoRelation logical operator 72e183b [Michael Armbrust] support for null
values in tree node args. ad596d2 [Michael Armbrust] add sc to Union's
otherCopyArgs e5c9d1a [Michael Armbrust] use nonEmpty dcc4fe1 [Michael
Armbrust] support for src1 test table. c78b587 [Michael Armbrust] casting.
75c3f3f [Michael Armbrust] add support for logging with scalalogging.
da2c011 [Michael Armbrust] make it more obvious when results are being
truncated. 96b73ba [Michael Armbrust] more docs in TestShark 18524fd
[Michael Armbrust] add method to SharkSqlQuery for directly executing the same
query on hive. e6d063b [Michael Armbrust] more join tests. 664c1c3
[Michael Armbrust] make parsing of function names case insensitive. 0967d4e
[Michael Armbrust] fix hardcoded path to hiveDevHome. 1a6db68 [Michael
Armbrust] spelling 7638cb4 [Michael Armbrust] simple join execution with dsl
tests. no hive tests yes. 859d4c9 [Michael Armbrust] better argString
printing of nested trees. fc53615 [Michael Armbrust] add same instance
comparisons for tree nodes. a026e6b [Michael Armbrust] move out hive
specific operators fff4d1c [Michael Armbrust] add simple query execution
debugging e2120ab [Michael Armbrust] sorting for strings da06eb6 [Michael
Armbrust] Parsing for sortby and joins 9eb5c5e [Michael Armbrust] override
equality in Attribute references to compare exprId. 8eb2460 [Michael
Armbrust] add system property to override whitelist. 88124bb [Michael
Armbrust] make strategy evaluation lazy. 74a3a21 [Michael Armbrust]
implement outputSet d25b171 [Michael Armbrust] Add AND and OR expressions
67f0a4a [Michael Armbrust] dsl improvements: string to attribute, subquery,
unionAll 12acf0a [Michael Armbrust] add .DS_Store for macs f7da6ce
[Michael Armbrust] add agg with grouping expr in select test 36805b3
[Michael Armbrust] pull out and improve aggregation 75613e1 [Michael
Armbrust] better evaluations failure messages. 4789a35 [Michael Armbrust]
weaken type since its hard to create pure references. e89dd36 [Michael
Armbrust] no newline for online trees d0590d4 [Michael Armbrust] include
stack trace for catalyst failures. 081c0d9 [Michael Armbrust] more generic
computation of agg functions. 31af3a0 [Michael Armbrust] fail when clauses
are unhandeled in the parser ecd45b2 [Michael Armbrust] Add more passing
tests. 97d5419 [Michael Armbrust] fix alignment. 565cc13 [Michael
Armbrust] make the canary query optional. a95e65c [Michael Armbrust] support
for resolving qualified attribute references. e1dfa0c [Michael Armbrust]
better error reporting for comparison tests when hive works but catalyst fails.
4640a0b [Michael Armbrust] handle test tables when database is specified.
bef12e3 [Michael Armbrust] Add Subquery node and trivial optimizer to remove it
after analysis. fec5158 [Michael Armbrust] add hive / idea files to
.gitignore 3f97ffe [Michael Armbrust] Rename Hive => HiveQl 656b836
[Michael Armbrust] Support for parsing select clause aliases. 3ca7414
[Michael Armbrust] StopAfter needs otherCopyArgs. 3ffde66 [Michael Armbrust]
When the child of an alias is unresolved it should return an unresolved
attribute instead of throwing an exception. 8cbef8a [Michael Armbrust]
spelling aa8c37c [Michael Armbrust] Better toString for SortOrder 1bb8b45
[Michael Armbrust] fix error message for UnresolvedExceptions a2e0327
[Michael Armbrust] add a bunch of tests. 4a3e1ea [Michael Armbrust] docs and
use shark for data loading. 339bb8f [Michael Armbrust] better docs, Not
support 1d7b2d9 [Michael Armbrust] Add NaN conversions. 46a2534 [Michael
Armbrust] only run canary query on failure. 8996066 [Michael Armbrust]
remove protected from makeCopy 53bcf41 [Michael Armbrust] testing
improvements: * reset hive vars * delete indexes and tables * delete database *
reset to use default database * record tests that pass 04a372a [Michael
Armbrust] add a flag for running all tests. 3b2235b [Michael Armbrust] More
general implementation of arithmetic. edd7795 [Michael Armbrust] More
testing improvements: * Check that results match for native commands * Ensure
explain commands can be planned * Cache hive "golden" results da6c577
[Michael Armbrust] add string <==> file utility functions. 3adf5ca [Michael
Armbrust] Initial support for groupBy and count. 7bcd8a4 [Michael Armbrust]
Improvements to comparison tests: * Sort answer when query doesn't contain an
order by. * Display null values the same as Hive. * Print full query results in
easy to read format when they differ. a52e7c9 [Michael Armbrust] Transform
children that are present in sequences of the product. d66ba7e [Michael
Armbrust] drop printlns. 88f2efd [Michael Armbrust] Add sum / count distinct
expressions. 05adedc [Michael Armbrust] rewrite relative paths when loading
data in TestShark 07784b3 [Michael Armbrust] add support for rewriting paths
and running 'set' commands. b8a9910 [Michael Armbrust] quote tests passing.
8e5e267 [Michael Armbrust] handle aliased select expressions. 4286a96
[Michael Armbrust] drop debugging println ac34aeb [Michael Armbrust] proof
of concept for hive ast transformations. 2238b00 [Michael Armbrust] better
error when makeCopy functions fails due to incorrect arguments ff1eab8
[Michael Armbrust] start trying to make insert into hive table more general.
74a6337 [Michael Armbrust] use fastEquals when doing transformations.
1184a23 [Michael Armbrust] add native test for escapes. b972b18 [Michael
Armbrust] create BaseRelation class fa6bce9 [Michael Armbrust] implement
union 6391a87 [Michael Armbrust] count aggregate. d47c317 [Michael
Armbrust] add unary minus, more tests passing. c7114e4 [Michael Armbrust]
first draft of star expansion. 044c43d [Michael Armbrust] better support for
numeric literal parsing. 1d0f072 [Michael Armbrust] use native drop table as
it doesn't appear to fail when the "table" is actually a view. 61503c5
[Michael Armbrust] add cached toRdd 2036883 [Michael Armbrust] skip explain
queries when testing. ebac4b1 [Michael Armbrust] fix bug in sort reference
calculation ca0dee0 [Michael Armbrust] docs. 1ee0471 [Michael Armbrust]
string literal parsing. 357278b [Michael Armbrust] add limit support
9b3e479 [Michael Armbrust] creation of string literals. 02efa30 [Michael
Armbrust] alias evaluation cb68b33 [Michael Armbrust] parsing for random
sample in hive ql. 126dd36 [Michael Armbrust] include query plans in failure
output bb59ae9 [Michael Armbrust] doc fixes 7e68286 [Michael Armbrust]
fix confusing naming 768bb25 [Michael Armbrust] handle errors in shark query
toString 829c3ce [Michael Armbrust] Auto loading of test data on demand. Add
reset method to test shark. Make test shark a singleton to avoid weirdness
with the hive megastore. ad02e41 [Michael Armbrust] comment jdo dependency
7bc89fe [Michael Armbrust] add collect to TreeNode. 438cf74 [Michael
Armbrust] create explicit treeString function in addition to toString override.
docs. 09679ee [Michael Armbrust] fix bug in TreeNode foreach 2930b27
[Michael Armbrust] more specific name for del query tests. 8842549 [Michael
Armbrust] docs. da81f81 [Michael Armbrust] Implementation and tests for
simple AVG query in Hive SQL. a8969b9 [Michael Armbrust] Factor out hive
query comparison test framework. 1a7efb0 [Michael Armbrust] specialize spark
aggregate for global aggregations. a36dd9a [Michael Armbrust] evaluation for
other > data types. cae729b [Michael Armbrust] remove unnecessary lazy vals.
d8e12af [Michael Armbrust] docs 3a60d67 [Michael Armbrust] implement
average, placeholder for count f05c106 [Michael Armbrust] checkAnswer
handles single row results. 2730534 [Michael Armbrust] implement inputSet
a9aa79d [Michael Armbrust] debugging for sort exec 8bec3c9 [Michael
Armbrust] better tree makeCopy when there are two constructors. 554b4b2
[Michael Armbrust] BoundAttribute pretty printing. 754f5fa [Michael
Armbrust] dsl for setting nullability a206d7a [Michael Armbrust] clean up
query tests. 84ad6ef [Michael Armbrust] better sort implementation and
tests. de24923 [Michael Armbrust] add double type. 9611a2c [Michael
Armbrust] literal creation for doubles. 7358313 [Michael Armbrust] sort
order returns child type. b544715 [Michael Armbrust] implement eval for
rand, and > for doubles 7013bad [Michael Armbrust] asc, desc should work for
expressions and unresolved attributes (symbols) 1c1a35e [Michael Armbrust]
add simple Rand expression. 3ca51de [Michael Armbrust] add orderBy to dsl
7ae41ab [Michael Armbrust] more literal implicit conversions b18b675
[Michael Armbrust] First cut at native query tests for shark. d392e29
[Michael Armbrust] add toRdd implicit conversion for logical plans in
TestShark. 5eac895 [Michael Armbrust] better error when descending is
specified. 2b16f86 [Michael Armbrust] add todo e527bb8 [Michael Armbrust]
remove arguments to binary predicate constructor as they seem to break
serialization 9dde3c8 [Michael Armbrust] add project and filter operations.
ad9037b [Michael Armbrust] Add support for local relations. 6227143
[Michael Armbrust] evaluation of Equals. 7526290 [Michael Armbrust]
BoundReference should also be an Attribute. bd33e26 [Michael Armbrust] more
documentation 5de0ea3 [Michael Armbrust] Move all shark specific into a
separate package. Lots of documentation improvements. 0ae292b [Michael
Armbrust] implement calculation of sort expressions. 9fd5011 [Michael
Armbrust] First cut at expression evaluation. 6259e3a [Michael Armbrust]
cleanup 787e5a2 [Michael Armbrust] use fastEquals f90da36 [Michael
Armbrust] better printing of optimization exceptions b05dd67 [Michael
Armbrust] Application of rules to fixed point. bb2e0db [Michael Armbrust]
pretty print for literals. 1ec3287 [Michael Armbrust] Add extractor for
IntegerLiterals. d3a3687 [Michael Armbrust] add fastEquals 2b4935b
[Michael Armbrust] set sbt.version explicitly 46dfd7f [Michael Armbrust]
first cut at checking answer for HiveCompatability tests. c79f2fd [Michael
Armbrust] insert operator should return an empty rdd. 14c22ec [Michael
Armbrust] implement sorting when the sort expression is the first attribute of
the input. ae7b4c3 [Michael Armbrust] remove implicit dependencies. now
compiles without copying things into lib/ manually. 84082f9 [Michael
Armbrust] add sbt binaries and scripts 15371a8 [Michael Armbrust] First
draft of simple Hive DDL parser. 063bf44 [Michael Armbrust] Periods should
end all comments. e1f7f4c [Michael Armbrust] Remove "NativePlaceholder"
hack. ed3633e [Michael Armbrust] start consolidating Hive/Shark specific
code. first hive compatibility test case passing! b34a770 [Michael Armbrust]
Add data sink strategy, make strategy application a little more robust.
e7174ec [Michael Armbrust] fix schema, add docs, make helper method protected.
26f410a [Michael Armbrust] physical traits should extend PhysicalPlan.
dc72469 [Michael Armbrust] beginning of hive compatibility testing framework.
0763490 [Michael Armbrust] support for hive native command pass-through.
d8a924f [Michael Armbrust] scaladoc 29a7163 [Michael Armbrust] Insert into
hive table physical operator. 633cebc [Michael Armbrust] better error
message when there is no appropriate planning strategy. 59ac444 [Michael
Armbrust] add unary expression 3aa1b28 [Michael Armbrust] support for table
names in the form 'database.tableName' 665f7d0 [Michael Armbrust] add
logical nodes for hive data sinks. 64d2923 [Michael Armbrust] Add classes
for representing sorts. f72b7ce [Michael Armbrust] first trivial end to end
query execution. 5c7d244 [Michael Armbrust] first draft of references
implementation. 7bff274 [Michael Armbrust] point at new shark. c7cd57f
[Michael Armbrust] docs for util function. 910811c [Michael Armbrust] check
each item of the sequence ef21a0b [Michael Armbrust] line up comments.
4b765d5 [Michael Armbrust] docs, drop println 6f9bafd [Michael Armbrust]
empty output for unresolved relation to avoid exception in resolution.
a703c49 [Michael Armbrust] this order works better until fixed point is
implemented. ec1d7c0 [Michael Armbrust] Simple attribute resolution.
069df02 [Michael Armbrust] parsing binary predicates a1cf754 [Michael
Armbrust] add joins and equality. 3f5bc98 [Michael Armbrust] add optiq to
sbt. 54f3460 [Michael Armbrust] initial optiq parsing. d9161ce [Michael
Armbrust] add join operator 1e423eb [Michael Armbrust] placeholders in
LogicalPlan, docs 24ef6fb [Michael Armbrust] toString for alias. ae7d776
[Michael Armbrust] add nullability changing function d49dc02 [Michael
Armbrust] scaladoc for named exprs 7c45dd7 [Michael Armbrust] pretty
printing of trees. 78e34bf [Michael Armbrust] simple git ignore. 7ba19be
[Michael Armbrust] First draft of interface to hive metastore. 7e7acf0
[Michael Armbrust] physical placeholder. 1c11136 [Michael Armbrust] first
draft of error handling / plans for debugging. 3766a41 [Michael Armbrust]
rearrange utility functions. 7fb3d5e [Michael Armbrust] docs and equality
improvements. 45da47b [Michael Armbrust] flesh out plans and expressions a
little. first cut at named expressions. 002d4d4 [Michael Armbrust] default
to no alias. be25003 [Michael Armbrust] add repl initialization to sbt.
0608a00 [Michael Armbrust] tighten public interface a1a8b38 [Michael
Armbrust] test that ids don't change for no-op transforms. daa71ca [Michael
Armbrust] foreach, maps, and scaladoc 6a158cb [Michael Armbrust] simple
transform working. db0299f [Michael Armbrust] basic analysis of relations
minus transform function. f74c4ee [Michael Armbrust] parsing a simple query.
08e4f57 [Michael Armbrust] upgrade scala include shark. d3c6404 [Michael
Armbrust] initial commit, 0, assembly, pom.xml, NULL, NULL, NULL,
9aadcffabd226557174f3ff566927f873c71672e, xml, null, 5, null, 0, 0]
java.nio.BufferOverflowException
at java.nio.HeapByteBuffer.put(HeapByteBuffer.java:189)
at java.nio.ByteBuffer.put(ByteBuffer.java:859)
at org.apache.kylin.engine.mr.steps.FactDistinctHiveColumnsMapper.map(FactDistinctHiveColumnsMapper.java:145)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
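
For reference, the top two frames of the trace (HeapByteBuffer.put called from ByteBuffer.put) are what the JDK produces when a byte array larger than the buffer's remaining space is written into a fixed-size ByteBuffer. The sketch below reproduces only that JDK behaviour. The class name, the `overflows` helper and the 64-byte capacity are hypothetical and chosen for illustration; they are not taken from Kylin's code, and the actual buffer size used by FactDistinctHiveColumnsMapper is not shown in this report.

```java
import java.nio.BufferOverflowException;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class BufferOverflowSketch {
    // Hypothetical capacity for illustration only; not Kylin's real buffer size.
    static final int CAPACITY = 64;

    /** Returns true if writing the value's UTF-8 bytes into a fixed-size buffer overflows. */
    public static boolean overflows(String value) {
        ByteBuffer buf = ByteBuffer.allocate(CAPACITY);
        byte[] bytes = value.getBytes(StandardCharsets.UTF_8);
        try {
            // ByteBuffer.put(byte[]) throws BufferOverflowException
            // when bytes.length exceeds buf.remaining().
            buf.put(bytes);
            return false;
        } catch (BufferOverflowException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // Build a value longer than the assumed capacity.
        StringBuilder longValue = new StringBuilder();
        for (int i = 0; i < 100; i++) {
            longValue.append('x');
        }
        System.out.println("short value overflows: " + overflows("spark"));
        System.out.println("long value overflows: " + overflows(longValue.toString()));
    }
}
```

If the mapper does write each column value into a buffer of fixed capacity, a very long field such as the commit-message column in the record above would fail in the same way, but that is an assumption to be confirmed against the source at FactDistinctHiveColumnsMapper.java:145.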