+1
On Wed, Mar 2, 2016 at 2:45 PM, Michael Armbrust
wrote:
> Please vote on releasing the following candidate as Apache Spark version
> 1.6.1!
>
> The vote is open until Saturday, March 5, 2016 at 20:00 UTC and passes if
> a majority of at least 3+1 PMC votes are cast.
>
Hi all,
Sometimes tasks fail with the exception "Received LaunchTask command but
executor was null", and I found it is a common problem:
https://issues.apache.org/jira/browse/SPARK-13112
https://issues.apache.org/jira/browse/SPARK-13060
I have a
SQL is very common, and even some business analysts learn it. Scala and
Python are great, but the easiest language to use is often the one a
user already knows. And for a lot of users, that is SQL.
On Wednesday, March 2, 2016, Jerry Lam wrote:
> Hi guys,
>
> FYI...
I see; we could reduce the memory by moving the copy out of HashedRelation,
but then we would need to do the copy before calling HashedRelation for a
shuffle hash join. Another thing is that when we do broadcasting, we will
have another serialized copy of the hash table.
For the table that's larger than 100M,
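Not Spark code, but a minimal Python sketch of the reuse hazard being discussed: when a producer recycles one mutable row buffer (as UnsafeRow iterators do), every entry inserted into a hash table without a copy ends up aliasing that same buffer. The names here (`rows`, `buf`) are illustrative only, not Spark APIs.

```python
def rows():
    # Producer reuses a single mutable buffer, like an UnsafeRow iterator.
    buf = [None, None]
    for key, value in [("a", 1), ("b", 2), ("c", 3)]:
        buf[0], buf[1] = key, value
        yield buf

# Wrong: every value in the hash table aliases the one recycled buffer,
# so all entries end up holding the last row produced.
aliased = {row[0]: row for row in rows()}
assert all(row == ["c", 3] for row in aliased.values())

# Right: copy each row before it goes into the hash table.
copied = {row[0]: list(row) for row in rows()}
assert copied["a"] == ["a", 1] and copied["b"] == ["b", 2]
```

This is why the copy has to happen somewhere before insertion; the debate above is only about which layer pays for it.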
Jay, thanks for the response.
Regarding the new consumer API for 0.9, I've been reading through the code
for it and thinking about how it fits in to the existing Spark integration.
So far I've seen some interesting challenges, and if you (or anyone else on
the dev list) have time to provide some
I would expect the memory pressure to grow because not only are we storing
the backing array for the iterator of the rows on the driver, but we’re
also storing a copy of each of those rows in the hash table. Whereas if we
didn’t do the copy on the driver side then the hash table would only have
to
UnsafeHashedRelation and HashedRelation could also be used in the executor
(for a non-broadcast hash join); then the UnsafeRow could come from
UnsafeProjection,
so we should copy the rows for safety.
We could have a smarter copy() for UnsafeRow (avoiding the copy if it's
already copied),
but I don't think
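A hedged sketch of that "smarter copy()" idea: a row that remembers whether it already owns a private buffer, so a second copy() is a no-op. `TrackedRow` is a made-up name for illustration; Spark's actual UnsafeRow carries no such flag.

```python
class TrackedRow:
    """Toy row whose copy() duplicates the backing buffer at most once."""

    def __init__(self, buf, owned=False):
        self.buf = buf
        self.owned = owned  # True once buf is private to this row

    def copy(self):
        # Avoid the copy if this row already owns its buffer.
        if self.owned:
            return self
        return TrackedRow(bytearray(self.buf), owned=True)

shared = bytearray(b"row-data")
row = TrackedRow(shared)
first = row.copy()
assert first is not row and first.buf is not shared  # a real copy happened
assert first.copy() is first                         # no-op afterwards
```

The cost of this scheme is an extra flag per row and the bookkeeping to keep it honest, which may be why it was not pursued.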
Hi guys,
FYI... this wiki page (StreamSQL: https://en.wikipedia.org/wiki/StreamSQL)
has some history related to Event Stream Processing and SQL.
Hi Steve,
It is difficult to ask your customers to learn a new language when they
are not programmers :)
I don't know where/why they
-dev +user
StructType(StructField(data,ArrayType(StructType(StructField(
> stuff,ArrayType(StructType(StructField(onetype,ArrayType(StructType(StructField(id,LongType,true),
> StructField(name,StringType,true)),true),true), StructField(othertype,
>
When you create a DataFrame using the sqlContext.read.schema() API, if you
pass in a schema that's compatible with some of the records but incompatible
with others, it seems you can't do a .select on the problematic columns;
instead you get an AnalysisException.
I know loading the wrong
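A Spark-free toy illustration of that failure mode, with hypothetical helper names (`select`, `AnalysisError`): a declared schema that only some records satisfy works fine until you project the column where the mismatch lives. Spark's real behavior differs in detail (the error is raised by the analyzer/execution, not a per-record check like this), so treat this only as a sketch of the shape of the problem.

```python
class AnalysisError(Exception):
    """Stand-in for Spark's AnalysisException in this toy example."""

def select(records, schema, column):
    """Project `column`, enforcing the declared type per record."""
    expected = schema.get(column)
    if expected is None:
        raise AnalysisError(f"cannot resolve column '{column}'")
    out = []
    for rec in records:
        value = rec.get(column)
        if value is not None and not isinstance(value, expected):
            raise AnalysisError(
                f"column '{column}' declared {expected.__name__}, "
                f"got {type(value).__name__}")
        out.append(value)
    return out

schema = {"id": int, "name": str}
records = [{"id": 1, "name": "ok"}, {"id": "oops", "name": "bad"}]

# The compatible column projects fine...
assert select(records, schema, "name") == ["ok", "bad"]

# ...but selecting the column that violates the declared schema fails.
try:
    select(records, schema, "id")
    raise AssertionError("expected AnalysisError")
except AnalysisError:
    pass
```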