spark git commit: [SPARK-9871] [SPARKR] Add expression functions into SparkR which have a variable parameter

2015-08-16 Thread shivaram
Repository: spark Updated Branches: refs/heads/branch-1.5 90245f65c -> 78275c480 [SPARK-9871] [SPARKR] Add expression functions into SparkR which have a variable parameter ### Summary - Add `lit` function - Add `concat`, `greatest`, `least` functions I think we need to improve `collect` fun

spark git commit: [SPARK-9871] [SPARKR] Add expression functions into SparkR which have a variable parameter

2015-08-16 Thread shivaram
Repository: spark Updated Branches: refs/heads/master ae2370e72 -> 26e760581 [SPARK-9871] [SPARKR] Add expression functions into SparkR which have a variable parameter ### Summary - Add `lit` function - Add `concat`, `greatest`, `least` functions I think we need to improve `collect` functio

spark git commit: [SPARK-10005] [SQL] Fixes schema merging for nested structs

2015-08-16 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-1.5 e2c6ef810 -> 90245f65c [SPARK-10005] [SQL] Fixes schema merging for nested structs In case of schema merging, we only handled first level fields when converting Parquet groups to `InternalRow`s. Nested struct fields are not properly ha

spark git commit: [SPARK-10005] [SQL] Fixes schema merging for nested structs

2015-08-16 Thread yhuai
Repository: spark Updated Branches: refs/heads/master cf016075a -> ae2370e72 [SPARK-10005] [SQL] Fixes schema merging for nested structs In case of schema merging, we only handled first level fields when converting Parquet groups to `InternalRow`s. Nested struct fields are not properly handle

spark git commit: [SPARK-9973] [SQL] Correct in-memory columnar buffer size

2015-08-16 Thread lian
Repository: spark Updated Branches: refs/heads/branch-1.5 fa55c2742 -> e2c6ef810 [SPARK-9973] [SQL] Correct in-memory columnar buffer size The `initialSize` argument of `ColumnBuilder.initialize()` should be the number of rows rather than bytes. However `InMemoryColumnarTableScan` passes in a

spark git commit: [SPARK-10008] Ensure shuffle locality doesn't take precedence over narrow deps

2015-08-16 Thread matei
Repository: spark Updated Branches: refs/heads/branch-1.5 4f75ce2e1 -> fa55c2742 [SPARK-10008] Ensure shuffle locality doesn't take precedence over narrow deps The shuffle locality patch made the DAGScheduler aware of shuffle data, but for RDDs that have both narrow and shuffle dependencies, i

spark git commit: [SPARK-10008] Ensure shuffle locality doesn't take precedence over narrow deps

2015-08-16 Thread matei
Repository: spark Updated Branches: refs/heads/master 5f9ce738f -> cf016075a [SPARK-10008] Ensure shuffle locality doesn't take precedence over narrow deps The shuffle locality patch made the DAGScheduler aware of shuffle data, but for RDDs that have both narrow and shuffle dependencies, it ca

spark git commit: [SPARK-8844] [SPARKR] head/collect is broken in SparkR.

2015-08-16 Thread shivaram
Repository: spark Updated Branches: refs/heads/master 182f9b7a6 -> 5f9ce738f [SPARK-8844] [SPARKR] head/collect is broken in SparkR. This is a WIP patch for SPARK-8844 for collecting reviews. This bug is about reading an empty DataFrame. In readCol(), lapply(1:numRows, function(x) { do

spark git commit: [SPARK-8844] [SPARKR] head/collect is broken in SparkR.

2015-08-16 Thread shivaram
Repository: spark Updated Branches: refs/heads/branch-1.5 881baf100 -> 4f75ce2e1 [SPARK-8844] [SPARKR] head/collect is broken in SparkR. This is a WIP patch for SPARK-8844 for collecting reviews. This bug is about reading an empty DataFrame. In readCol(), lapply(1:numRows, function(x)
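
The SPARK-8844 snippet is cut off, so the actual patch is not visible in this listing. As a hedged illustration of why `lapply(1:numRows, ...)` can misbehave on an empty DataFrame: in R, `1:0` evaluates to `c(1, 0)` rather than an empty sequence, so the loop body runs twice when `numRows` is 0. The Python sketch below only emulates that behavior; `r_colon` and `r_seq_len` are hypothetical helper names, and the idea that the fix relies on something like R's `seq_len` is an assumption, not something this archive confirms.

```python
def r_colon(start, stop):
    """Emulate R's integer `start:stop`, which counts down when stop < start."""
    step = 1 if stop >= start else -1
    return list(range(start, stop + step, step))


def r_seq_len(n):
    """Emulate R's `seq_len(n)`: 1..n, and an empty sequence when n is 0."""
    return list(range(1, n + 1))


num_rows = 0  # stands in for an empty DataFrame

# The colon form yields two indices even though there are no rows.
print(r_colon(1, num_rows))

# The seq_len form yields no indices, so a loop over it does nothing.
print(r_seq_len(num_rows))
```

For non-empty input the two forms agree (`r_colon(1, 3)` and `r_seq_len(3)` both give `[1, 2, 3]`); they differ only in the zero-row case.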