Hello Rahul,
Please try using df.filter(df("id").isin(1, 2)).
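A minimal sketch of the whole flow (the bucket path and app name are hypothetical; adjust to your environment):

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("isin-example").getOrCreate()

// Read the Parquet data from S3 (path is an assumption).
val df = spark.read.parquet("s3a://your-bucket/path/")

// Keep only rows whose id is 1 or 2; this predicate can be
// pushed down to the Parquet reader.
val filtered = df.filter(df("id").isin(1, 2))
filtered.show()
```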
Thanks,
On Thu, Mar 30, 2017 at 10:45 PM, Rahul Nandi
wrote:
> Hi,
> I have around 2 million records as a Parquet file in S3. The file
> structure is somewhat like:
> id data
> 1 abc
> 2 cdf
> 3 fas
> Now I want
Hello All,
I am working on creating a new PrunedFilteredScan operator that can
execute the predicates pushed down to it.
However, what I observed is that if a column deep in the hierarchy is
used, the predicate is not pushed down.
SELECT tom._id, tom.address.city from
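For context, a sketch of what such a relation looks like (class name and bodies are placeholders, not the original code). Note that Spark only translates predicates on top-level attributes into org.apache.spark.sql.sources.Filter objects, which is consistent with the behaviour described above for nested columns such as address.city:

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.{Row, SQLContext}
import org.apache.spark.sql.sources.{BaseRelation, Filter, PrunedFilteredScan}
import org.apache.spark.sql.types.StructType

// Hypothetical relation implementing PrunedFilteredScan.
class MyRelation(override val sqlContext: SQLContext)
    extends BaseRelation with PrunedFilteredScan {

  override def schema: StructType = ???  // relation's schema

  override def buildScan(requiredColumns: Array[String],
                         filters: Array[Filter]): RDD[Row] = {
    // `filters` contains only the predicates Spark could translate
    // (top-level attributes). Predicates on nested fields are not
    // passed here; Spark re-evaluates them after the scan, so the
    // scan itself cannot use them to prune data.
    ???
  }
}
```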
Hello All,
I am trying to test an application on a standalone cluster. Here is my
scenario.
I started a Spark master on node A, and also one worker on the same node A.
I am trying to run the application from node B (I think this means node B
acts as the driver).
I have added jars to the sparkconf using
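The original message is truncated here, so the exact call is unknown; a common way to register application jars on a SparkConf looks like this (host, port, and jar path are assumptions):

```scala
import org.apache.spark.SparkConf

// Sketch: point the driver at the application jar so the worker on
// node A can fetch it from the driver on node B.
val conf = new SparkConf()
  .setMaster("spark://nodeA:7077")   // hypothetical master URL
  .setAppName("my-app")
  .setJars(Seq("/path/to/my-app.jar"))  // hypothetical jar path
```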
Hello Aditya,
After an intermediate action has been applied, you might want to call
rdd.unpersist() to let Spark know that this RDD is no longer required.
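A small sketch of that pattern (assumes an existing SparkContext `sc`, e.g. from spark-shell; the input path is hypothetical):

```scala
// Cache an intermediate RDD, use it, then release it.
val rdd1 = sc.textFile("input.txt").cache()

val count = rdd1.count()   // action materialises and caches rdd1
// ... any further work that reuses rdd1 ...

rdd1.unpersist()           // evict the cached partitions from memory
```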
Thanks,
-Hanu
On Thu, Sep 22, 2016 at 7:54 AM, Aditya
wrote:
> Hi,
>
> Suppose I have two RDDs
> val