Repository: spark
Updated Branches:
refs/heads/branch-1.6 ab2a124c8 -> 1cf9d3858
[SPARK-12030] Fix Platform.copyMemory to handle overlapping regions.
This bug was exposed as memory corruption in Timsort, which uses copyMemory to copy
large regions that can overlap. The prior implementation
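For context, the overlap-safe behaviour is the classic memmove direction check. A minimal sketch of the idea, using an ordinary byte array rather than Spark's Unsafe-based Platform (names here are illustrative, not the actual Platform.java code):

```scala
// Overlap-safe copy within a single array: if the destination starts past the source,
// a front-to-back copy would overwrite bytes before they are read, so copy back-to-front.
def copyWithinArray(buf: Array[Byte], srcPos: Int, dstPos: Int, length: Int): Unit = {
  if (dstPos <= srcPos) {
    var i = 0
    while (i < length) {           // forward: writes trail the reads
      buf(dstPos + i) = buf(srcPos + i)
      i += 1
    }
  } else {
    var i = length - 1
    while (i >= 0) {               // backward: avoids clobbering unread source bytes
      buf(dstPos + i) = buf(srcPos + i)
      i -= 1
    }
  }
}
```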
Repository: spark
Updated Branches:
refs/heads/master 34e7093c1 -> 2cef1cdfb
[SPARK-12030] Fix Platform.copyMemory to handle overlapping regions.
This bug was exposed as memory corruption in Timsort, which uses copyMemory to copy
large regions that can overlap. The prior implementation did
Repository: spark
Updated Branches:
refs/heads/branch-1.6 1cf9d3858 -> 81db8d086
[SPARK-12004] Preserve the RDD partitioner through RDD checkpointing
The solution is to save the RDD partitioner in a separate file in the RDD
checkpoint directory. That is, `/_partitioner`. In most cases,
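A rough sketch of the approach under stated assumptions: the helper names below are hypothetical (the real logic lives in ReliableCheckpointRDD), but the shape is to serialize the partitioner next to the checkpointed data and to treat a missing or unreadable file as "no partitioner" rather than a recovery failure.

```scala
import java.io.{ObjectInputStream, ObjectOutputStream}
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.Path
import org.apache.spark.Partitioner

// Hypothetical helpers illustrating the approach.
def writePartitioner(checkpointDir: Path, p: Partitioner, conf: Configuration): Unit = {
  val fs = checkpointDir.getFileSystem(conf)
  val out = new ObjectOutputStream(fs.create(new Path(checkpointDir, "_partitioner")))
  try out.writeObject(p) finally out.close()
}

def readPartitioner(checkpointDir: Path, conf: Configuration): Option[Partitioner] = {
  val fs = checkpointDir.getFileSystem(conf)
  val file = new Path(checkpointDir, "_partitioner")
  if (!fs.exists(file)) return None
  val in = new ObjectInputStream(fs.open(file))
  // Failing to read the partitioner should not fail recovery; just fall back to None.
  try Some(in.readObject().asInstanceOf[Partitioner])
  catch { case _: Exception => None }
  finally in.close()
}
```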
Repository: spark
Updated Branches:
refs/heads/master 2cef1cdfb -> 60b541ee1
[SPARK-12004] Preserve the RDD partitioner through RDD checkpointing
The solution is to save the RDD partitioner in a separate file in the RDD
checkpoint directory. That is, `/_partitioner`. In most cases,
Repository: spark
Updated Branches:
refs/heads/branch-1.6 21909b8ac -> 5647774b0
[SPARK-11961][DOC] Add docs of ChiSqSelector
https://issues.apache.org/jira/browse/SPARK-11961
Author: Xusen Yin
Closes #9965 from yinxusen/SPARK-11961.
(cherry picked from commit
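Since this commit only adds documentation, here is a brief usage sketch of the MLlib API being documented; the feature count and the input RDD are placeholders.

```scala
import org.apache.spark.mllib.feature.ChiSqSelector
import org.apache.spark.mllib.regression.LabeledPoint
import org.apache.spark.rdd.RDD

// Keep the 50 most predictive features according to a chi-squared test against the labels.
def selectTopFeatures(data: RDD[LabeledPoint]): RDD[LabeledPoint] = {
  val selector = new ChiSqSelector(50)   // 50 is a placeholder for numTopFeatures
  val model = selector.fit(data)
  data.map(lp => LabeledPoint(lp.label, model.transform(lp.features)))
}
```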
Repository: spark
Updated Branches:
refs/heads/master 328b757d5 -> e76431f88
[SPARK-11961][DOC] Add docs of ChiSqSelector
https://issues.apache.org/jira/browse/SPARK-11961
Author: Xusen Yin
Closes #9965 from yinxusen/SPARK-11961.
Repository: spark
Updated Branches:
refs/heads/master 47a0abc34 -> 5a8b5fdd6
[SPARK-11788][SQL] surround timestamp/date value with quotes in JDBC data source
When querying the Timestamp or Date column like the following
val filtered = jdbcdf.where($"TIMESTAMP_COLUMN" >= beg &&
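The gist of the fix, sketched below with a simplified literal-compilation function (the real change is in the JDBC data source's filter pushdown; this is not the actual code): timestamp and date literals have to be wrapped in quotes, otherwise the generated WHERE clause is invalid SQL on most databases.

```scala
import java.sql.{Date, Timestamp}

// Simplified sketch of compiling a pushed-down filter value into a SQL literal.
def compileValue(value: Any): String = value match {
  case s: String    => s"'${s.replace("'", "''")}'"  // escape embedded quotes
  case t: Timestamp => s"'$t'"                        // quoted, e.g. '2015-12-01 00:00:00.0'
  case d: Date      => s"'$d'"                        // quoted, e.g. '2015-12-01'
  case other        => other.toString                 // numbers etc. stay unquoted
}
```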
Repository: spark
Updated Branches:
refs/heads/master ef6790fdc -> 47a0abc34
[SPARK-11328][SQL] Improve error message when hitting this issue
The issue is that the output committer is not idempotent and retry attempts will
fail because the output file already exists. It is not safe to clean up
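A hedged sketch of the kind of guard this implies (illustrative only, not the actual Spark change): detect the leftover output from a previous attempt and raise an error that names the real cause.

```scala
import org.apache.hadoop.fs.{FileSystem, Path}

// Illustrative check: fail fast with a descriptive message when a retried attempt
// finds output left behind by an earlier, non-idempotent commit.
def ensureNotAlreadyWritten(fs: FileSystem, outputFile: Path): Unit = {
  if (fs.exists(outputFile)) {
    throw new RuntimeException(
      s"Output file $outputFile already exists. The output committer is not idempotent, " +
        "so a retried task attempt cannot safely overwrite data written by a previous attempt.")
  }
}
```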
Repository: spark
Updated Branches:
refs/heads/master 5a8b5fdd6 -> 5872a9d89
[SPARK-11352][SQL] Escape */ in the generated comments.
https://issues.apache.org/jira/browse/SPARK-11352
Author: Yin Huai
Closes #10072 from yhuai/SPARK-11352.
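The underlying issue: generated Java sources embed expression text inside /* ... */ comments, and a literal */ in that text terminates the comment early and breaks compilation. A minimal sketch of the escaping idea (Spark's actual codegen hardening is more thorough):

```scala
// Break up "*/" before embedding arbitrary text in a /* ... */ comment in generated code.
def toCommentSafeString(text: String): String = text.replace("*/", "*\\/")
```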
Repository: spark
Updated Branches:
refs/heads/branch-1.6 d77bf0bd9 -> f1122dd2b
[SPARK-11328][SQL] Improve error message when hitting this issue
The issue is that the output committer is not idempotent and retry attempts will
fail because the output file already exists. It is not safe to
Repository: spark
Updated Branches:
refs/heads/master f292018f8 -> ef6790fdc
[SPARK-12075][SQL] Speed up HiveComparisionTest by avoiding / speeding up
TestHive.reset()
When profiling HiveCompatibilitySuite, I noticed that most of the time seems to
be spent in expensive `TestHive.reset()`
Repository: spark
Updated Branches:
refs/heads/branch-1.6 012de2ce5 -> d77bf0bd9
[SPARK-12075][SQL] Speed up HiveComparisionTest by avoiding / speeding up
TestHive.reset()
When profiling HiveCompatibilitySuite, I noticed that most of the time seems to
be spent in expensive
Repository: spark
Updated Branches:
refs/heads/branch-1.5 80dac0b07 -> f28399e1a
[SPARK-11328][SQL] Improve error message when hitting this issue
The issue is that the output committer is not idempotent and retry attempts will
fail because the output file already exists. It is not safe to
Repository: spark
Updated Branches:
refs/heads/branch-1.6 81db8d086 -> 21909b8ac
Revert "[SPARK-12060][CORE] Avoid memory copy in
JavaSerializerInstance.serialize"
This reverts commit 9b99b2b46c452ba396e922db5fc7eec02c45b158.
Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Repository: spark
Updated Branches:
refs/heads/branch-1.5 fc3fb8463 -> 7460e4309
[SPARK-12030] Fix Platform.copyMemory to handle overlapping regions.
This bug was exposed as memory corruption in Timsort, which uses copyMemory to copy
large regions that can overlap. The prior implementation
Repository: spark
Updated Branches:
refs/heads/master 60b541ee1 -> 328b757d5
Revert "[SPARK-12060][CORE] Avoid memory copy in
JavaSerializerInstance.serialize"
This reverts commit 1401166576c7018c5f9c31e0a6703d5fb16ea339.
Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Repository: spark
Updated Branches:
refs/heads/branch-1.6 5647774b0 -> 012de2ce5
[SPARK-12002][STREAMING][PYSPARK] Fix python direct stream checkpoint recovery
issue
Fixed a minor race condition in #10017
Closes #10017
Author: jerryshao
Author: Shixiong Zhu
Repository: spark
Updated Branches:
refs/heads/master e76431f88 -> f292018f8
[SPARK-12002][STREAMING][PYSPARK] Fix python direct stream checkpoint recovery
issue
Fixed a minor race condition in #10017
Closes #10017
Author: jerryshao
Author: Shixiong Zhu
Repository: spark
Updated Branches:
refs/heads/branch-1.6 1135430a0 -> 14eadf921
[SPARK-11352][SQL] Escape */ in the generated comments.
https://issues.apache.org/jira/browse/SPARK-11352
Author: Yin Huai
Closes #10072 from yhuai/SPARK-11352.
(cherry picked from
Repository: spark
Updated Branches:
refs/heads/branch-1.6 f1122dd2b -> 1135430a0
[SPARK-11788][SQL] surround timestamp/date value with quotes in JDBC data source
When querying the Timestamp or Date column like the following
val filtered = jdbcdf.where($"TIMESTAMP_COLUMN" >= beg &&
Repository: spark
Updated Branches:
refs/heads/branch-1.5 f28399e1a -> fc3fb8463
[SPARK-11788][SQL] surround timestamp/date value with quotes in JDBC data source
When querying the Timestamp or Date column like the following
val filtered = jdbcdf.where($"TIMESTAMP_COLUMN" >= beg &&
Repository: spark
Updated Branches:
refs/heads/master 5872a9d89 -> e96a70d5a
[SPARK-11596][SQL] In TreeNode's argString, if a TreeNode is not a child of the
current TreeNode, we should only return the simpleString.
In TreeNode's argString, if a TreeNode is not a child of the current
Repository: spark
Updated Branches:
refs/heads/branch-1.6 14eadf921 -> 1b3db967e
[SPARK-11596][SQL] In TreeNode's argString, if a TreeNode is not a child of the
current TreeNode, we should only return the simpleString.
In TreeNode's argString, if a TreeNode is not a child of the current
Repository: spark
Updated Branches:
refs/heads/branch-1.6 add4e6311 -> 9b99b2b46
[SPARK-12060][CORE] Avoid memory copy in JavaSerializerInstance.serialize
`JavaSerializerInstance.serialize` uses `ByteArrayOutputStream.toByteArray` to
get the serialized data.
Repository: spark
Updated Branches:
refs/heads/master c87531b76 -> 140116657
[SPARK-12060][CORE] Avoid memory copy in JavaSerializerInstance.serialize
`JavaSerializerInstance.serialize` uses `ByteArrayOutputStream.toByteArray` to
get the serialized data. `ByteArrayOutputStream.toByteArray`
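The trick behind the fix, as a small sketch: ByteArrayOutputStream.toByteArray always allocates a fresh array and copies into it, but the stream's internal buffer (`buf`) and written length (`count`) are protected fields, so a subclass can expose the bytes already written as a ByteBuffer view without copying. The class name below mirrors the commit's intent but is illustrative.

```scala
import java.io.ByteArrayOutputStream
import java.nio.ByteBuffer

// Expose the already-written bytes without the defensive copy made by toByteArray.
class ByteBufferOutputStream extends ByteArrayOutputStream {
  def toByteBuffer: ByteBuffer = ByteBuffer.wrap(buf, 0, count)
}
```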
Repository: spark
Updated Branches:
refs/heads/master 140116657 -> 69dbe6b40
[SPARK-12046][DOC] Fixes various ScalaDoc/JavaDoc issues
This PR backports PR #10039 to master
Author: Cheng Lian
Closes #10063 from liancheng/spark-12046.doc-fix.master.
Repository: spark
Updated Branches:
refs/heads/master 69dbe6b40 -> 8ddc55f1d
[SPARK-12068][SQL] use a single column in Dataset.groupBy and count will fail
The reason is that, for a single column `RowEncoder` (or a single field product
encoder), when we use it as the encoder for grouping key,
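For reference, this is roughly the shape of query that used to fail (data and column name are made up; assumes an existing `sqlContext`):

```scala
import sqlContext.implicits._

val ds = Seq(("a", 10), ("a", 20), ("b", 1)).toDS()
// Grouping by a single column and counting previously failed for single-column encoders.
val counts = ds.groupBy($"_1").count()
counts.show()
```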
Repository: spark
Updated Branches:
refs/heads/branch-1.6 9b99b2b46 -> 6e3e3c648
[SPARK-12068][SQL] use a single column in Dataset.groupBy and count will fail
The reason is that, for a single column `RowEncoder` (or a single field product
encoder), when we use it as the encoder for grouping
Repository: spark
Updated Branches:
refs/heads/branch-1.6 6e3e3c648 -> 74a230676
[SPARK-11856][SQL] add type cast if the real type is different but compatible
with encoder schema
When we build the `fromRowExpression` for an encoder, we set up a lot of
"unresolved" stuff and lost the
Repository: spark
Updated Branches:
refs/heads/master 0f37d1d7e -> 4375eb3f4
[SPARK-12090] [PYSPARK] consider shuffle in coalesce()
Author: Davies Liu
Closes #10090 from davies/fix_coalesce.
Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Repository: spark
Updated Branches:
refs/heads/branch-1.5 0d57a4ae1 -> ed7264ba2
[SPARK-12090] [PYSPARK] consider shuffle in coalesce()
Author: Davies Liu
Closes #10090 from davies/fix_coalesce.
(cherry picked from commit 4375eb3f48fc7ae90caf6c21a0d3ab0b66bf4efa)
Repository: spark
Updated Branches:
refs/heads/branch-1.6 3c4938e26 -> c47a7373a
[SPARK-12090] [PYSPARK] consider shuffle in coalesce()
Author: Davies Liu
Closes #10090 from davies/fix_coalesce.
(cherry picked from commit 4375eb3f48fc7ae90caf6c21a0d3ab0b66bf4efa)
Repository: spark
Updated Branches:
refs/heads/branch-1.6 72da2a21f -> 84c44b500
[SPARK-12081] Make unified memory manager work with small heaps
The existing `spark.memory.fraction` (default 0.75) gives the system 25% of the
space to work with. For small heaps, this is not enough: e.g.
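To make the arithmetic concrete, a rough illustration of the small-heap problem; the 300 MB reserve mirrors the direction of the fix, but treat the exact constant as illustrative:

```scala
val heapBytes      = 1024L * 1024 * 1024   // a 1 GB executor heap
val memoryFraction = 0.75                  // spark.memory.fraction default
val reservedBytes  = 300L * 1024 * 1024    // fixed system reserve (illustrative)

// Without a reserve, only ~256 MB is left for everything outside Spark's managed memory.
val managedNoReserve   = (heapBytes * memoryFraction).toLong                   // ~768 MB
// Carving out the reserve first keeps a sane floor for user code and JVM overhead.
val managedWithReserve = ((heapBytes - reservedBytes) * memoryFraction).toLong // ~543 MB
```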
Repository: spark
Updated Branches:
refs/heads/branch-1.6 1b3db967e -> 72da2a21f
[SPARK-8414] Ensure context cleaner periodic cleanups
Garbage collection triggers cleanups. If the driver JVM is huge and there is
little memory pressure, we may never clean up shuffle files on executors. This
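The cleanup path is driven by weak references, so it only fires when the driver garbage-collects. A minimal sketch of the periodic-GC idea (the interval is illustrative, not the actual default):

```scala
import java.util.concurrent.{Executors, TimeUnit}

// Force a driver GC on a fixed schedule so unreachable RDDs/broadcasts/shuffles are
// eventually detected and their executor-side files cleaned up, even on an idle heap.
val periodicGC = Executors.newSingleThreadScheduledExecutor()
periodicGC.scheduleAtFixedRate(
  new Runnable { def run(): Unit = System.gc() },
  30, 30, TimeUnit.MINUTES)
```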
Repository: spark
Updated Branches:
refs/heads/branch-1.5 7460e4309 -> 4f07a590c
[SPARK-11352][SQL][BRANCH-1.5] Escape */ in the generated comments.
https://issues.apache.org/jira/browse/SPARK-11352
This one backports https://github.com/apache/spark/pull/10072 to branch 1.5.
Author: Yin
Repository: spark
Updated Branches:
refs/heads/branch-1.6 84c44b500 -> a5743affc
[SPARK-12077][SQL] change the default plan for single distinct
We currently try to match the behavior of single distinct aggregation in Spark 1.5,
but that's not scalable; we should be robust by default and have a flag
Repository: spark
Updated Branches:
refs/heads/master d96f8c997 -> 96691feae
[SPARK-12077][SQL] change the default plan for single distinct
We currently try to match the behavior of single distinct aggregation in Spark 1.5,
but that's not scalable; we should be robust by default and have a flag to
Repository: spark
Updated Branches:
refs/heads/master 96691feae -> 8a75a3049
[SPARK-12087][STREAMING] Create new JobConf for every batch in saveAsHadoopFiles
The JobConf object created in `DStream.saveAsHadoopFiles` is used concurrently
in multiple places:
* The JobConf is updated by
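A simplified sketch of the fix's shape (the real change is inside `DStream.saveAsHadoopFiles`; the helper below is hypothetical): build a fresh JobConf per batch inside the foreachRDD closure instead of mutating one shared instance across concurrent batches.

```scala
import org.apache.hadoop.mapred.{JobConf, TextOutputFormat}
import org.apache.spark.streaming.dstream.DStream

def saveEachBatch(stream: DStream[(String, String)], prefix: String, base: JobConf): Unit = {
  stream.foreachRDD { (rdd, time) =>
    val conf = new JobConf(base)   // per-batch copy, safe to mutate independently
    rdd.saveAsHadoopFile(
      s"$prefix-${time.milliseconds}",
      classOf[String], classOf[String],
      classOf[TextOutputFormat[String, String]],
      conf)
  }
}
```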
Repository: spark
Updated Branches:
refs/heads/branch-1.5 4f07a590c -> 0d57a4ae1
[SPARK-12087][STREAMING] Create new JobConf for every batch in saveAsHadoopFiles
The JobConf object created in `DStream.saveAsHadoopFiles` is used concurrently
in multiple places:
* The JobConf is updated by
Repository: spark
Updated Branches:
refs/heads/branch-1.6 a5743affc -> 1f42295b5
[SPARK-12087][STREAMING] Create new JobConf for every batch in saveAsHadoopFiles
The JobConf object created in `DStream.saveAsHadoopFiles` is used concurrently
in multiple places:
* The JobConf is updated by
Repository: spark
Updated Branches:
refs/heads/branch-1.6 1f42295b5 -> 3c4938e26
[SPARK-11949][SQL] Check bitmasks to set nullable property
Following up #10038.
We can use bitmasks to determine which grouping expressions need to be set as
nullable.
cc yhuai
Author: Liang-Chi Hsieh
Repository: spark
Updated Branches:
refs/heads/branch-1.4 f5af299ab -> b6ba2dab2
[SPARK-12087][STREAMING] Create new JobConf for every batch in saveAsHadoopFiles
The JobConf object created in `DStream.saveAsHadoopFiles` is used concurrently
in multiple places:
* The JobConf is updated by
Repository: spark
Updated Branches:
refs/heads/master 8a75a3049 -> 0f37d1d7e
[SPARK-11949][SQL] Check bitmasks to set nullable property
Following up #10038.
We can use bitmasks to determine which grouping expressions need to be set as
nullable.
cc yhuai
Author: Liang-Chi Hsieh
Repository: spark
Updated Branches:
refs/heads/master 9df24624a -> fd95eeaf4
[SPARK-11954][SQL] Encoder for JavaBeans
Create a Java version of `constructorFor` and `extractorFor` in
`JavaTypeInference`
Author: Wenchen Fan
This patch had conflicts when merged,
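A brief usage sketch of what the change enables from Scala (the bean class and data are made up; assumes an existing `sqlContext`):

```scala
import org.apache.spark.sql.{Encoders, SQLContext}

// A tiny JavaBean-style class: no-arg constructor plus getter/setter pairs.
class Point extends Serializable {
  private var x: Int = 0
  private var y: Int = 0
  def getX: Int = x
  def setX(v: Int): Unit = { x = v }
  def getY: Int = y
  def setY(v: Int): Unit = { y = v }
}

def pointsDataset(sqlContext: SQLContext, points: Seq[Point]) = {
  implicit val pointEncoder = Encoders.bean(classOf[Point])  // maps getters/setters to columns
  sqlContext.createDataset(points)
}
```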
Repository: spark
Updated Branches:
refs/heads/branch-1.6 74a230676 -> 88bbce008
[SPARK-11954][SQL] Encoder for JavaBeans
Create a Java version of `constructorFor` and `extractorFor` in
`JavaTypeInference`
Author: Wenchen Fan
This patch had conflicts when merged,
Repository: spark
Updated Branches:
refs/heads/branch-1.6 88bbce008 -> 40769b48c
[SPARK-11905][SQL] Support Persist/Cache and Unpersist in Dataset APIs
Persist and Unpersist exist in both RDD and DataFrame APIs. I think they are
still very critical in Dataset APIs. Not sure if my
Repository: spark
Updated Branches:
refs/heads/master fd95eeaf4 -> 0a7bca2da
[SPARK-11905][SQL] Support Persist/Cache and Unpersist in Dataset APIs
Persist and Unpersist exist in both RDD and DataFrame APIs. I think they are
still very critical in Dataset APIs. Not sure if my understanding
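A short usage sketch of the added API (assumes an existing `sqlContext`; the data is a placeholder):

```scala
import sqlContext.implicits._

val ds = sqlContext.range(0, 1000000).as[Long]
ds.persist()                                // or ds.cache(); same semantics as RDD/DataFrame caching
val evens = ds.filter(_ % 2 == 0).count()   // first action materializes the cache
val odds  = ds.filter(_ % 2 == 1).count()   // second action reuses it
ds.unpersist()                              // release the cached data when done
```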
Repository: spark
Updated Branches:
refs/heads/branch-1.6 40769b48c -> 843a31afb
[SPARK-12046][DOC] Fixes various ScalaDoc/JavaDoc issues
This PR backports PR #10039 to master
Author: Cheng Lian
Closes #10063 from liancheng/spark-12046.doc-fix.master.
(cherry picked
Repository: spark
Updated Branches:
refs/heads/branch-1.6 843a31afb -> 99dc1335e
[SPARK-11821] Propagate Kerberos keytab for all environments
andrewor14 the same PR as in branch 1.5
harishreedharan
Author: woj-i
Closes #9859 from woj-i/master.
(cherry picked from
Repository: spark
Updated Branches:
refs/heads/master 0a7bca2da -> 6a8cf80cc
[SPARK-11821] Propagate Kerberos keytab for all environments
andrewor14 the same PR as in branch 1.5
harishreedharan
Author: woj-i
Closes #9859 from woj-i/master.
Repository: spark
Updated Branches:
refs/heads/branch-1.6 99dc1335e -> ab2a124c8
[SPARK-12065] Upgrade Tachyon from 0.8.1 to 0.8.2
This commit upgrades the Tachyon dependency from 0.8.1 to 0.8.2.
Author: Josh Rosen
Closes #10054 from
Repository: spark
Updated Branches:
refs/heads/master 6a8cf80cc -> 34e7093c1
[SPARK-12065] Upgrade Tachyon from 0.8.1 to 0.8.2
This commit upgrades the Tachyon dependency from 0.8.1 to 0.8.2.
Author: Josh Rosen
Closes #10054 from
Repository: spark
Updated Branches:
refs/heads/branch-1.5 d78f1bc45 -> 80dac0b07
Set SPARK_EC2_VERSION to 1.5.2
Author: Alexander Pivovarov
Closes #10064 from apivovarov/patch-1.
Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Repository: spark
Updated Branches:
refs/heads/master 9693b0d5a -> a0af0e351
[SPARK-11898][MLLIB] Use broadcast for the global tables in Word2Vec
jira: https://issues.apache.org/jira/browse/SPARK-11898
syn0Global and syn1Global in Word2Vec are quite large objects with size (vocab
*
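The pattern the fix applies, in a hedged sketch (the real change is inside Word2Vec's training loop; the method below and its arguments are illustrative): ship the large weight arrays to executors as broadcast variables instead of capturing them in every task closure.

```scala
import org.apache.spark.SparkContext
import org.apache.spark.rdd.RDD

def trainIteration(sc: SparkContext,
                   sentences: RDD[Array[Int]],
                   syn0Global: Array[Float],
                   syn1Global: Array[Float]): Unit = {
  val bcSyn0 = sc.broadcast(syn0Global)
  val bcSyn1 = sc.broadcast(syn1Global)
  sentences.foreachPartition { iter =>
    val syn0 = bcSyn0.value       // one deserialized copy per executor, not one per task
    val syn1 = bcSyn1.value
    // ... skip-gram updates against syn0/syn1 would go here ...
  }
  bcSyn0.unpersist()              // drop the broadcast blocks once the iteration is done
  bcSyn1.unpersist()
}
```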
Repository: spark
Updated Branches:
refs/heads/master a0af0e351 -> c87531b76
[SPARK-11949][SQL] Set field nullable property for GroupingSets to get correct
results for null values
JIRA: https://issues.apache.org/jira/browse/SPARK-11949
The result of a cube plan uses an incorrect schema. The