Hi,
In our custom datasource implementation, we want to inject some query-level
information.
For example -
scala> val df = spark.sql("some query") // uses custom datasource under
the hood through Session Extensions.
scala> df.count // here we want some kind of pre-execution hook just
before the action actually executes
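One possibility (a sketch only, not a DSV2-specific hook): register a
SparkListener and use onJobStart, which fires on the driver when a job is
submitted. The listener API below is real; the wiring around it is
illustrative.

import org.apache.spark.scheduler.{SparkListener, SparkListenerJobStart}

// Note: listener events are delivered asynchronously, so this is "around
// job start" rather than a strictly synchronous pre-execution hook.
spark.sparkContext.addSparkListener(new SparkListener {
  override def onJobStart(jobStart: SparkListenerJobStart): Unit = {
    // For SQL-triggered jobs the execution id travels as a local property;
    // it is null for jobs that don't come from the SQL layer.
    val execId =
      Option(jobStart.properties).map(_.getProperty("spark.sql.execution.id")).orNull
    println(s"Job ${jobStart.jobId} starting, sql execution id = $execId")
  }
})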
Hi,
Consider the following statements:
1)
> scala> val df = spark.read.format("com.shubham.MyDataSource").load
> scala> df.show
> +---+---+
> | i| j|
> +---+---+
> | 0| 0|
> | 1| -1|
> | 2| -2|
> | 3| -3|
> | 4| -4|
> +---+---+
2)
> scala> val df1 = df.filter("i < 3")
> scala> df1.show
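One thing worth checking with a filtered scan like df1 is whether the
source sees the `i < 3` predicate at all. A rough sketch (trait and field
names invented) of how a DSV2 reader can opt in via SupportsPushDownFilters:

import org.apache.spark.sql.sources.Filter
import org.apache.spark.sql.sources.v2.reader.SupportsPushDownFilters

trait MyFilterPushdown extends SupportsPushDownFilters {
  private var pushed: Array[Filter] = Array.empty

  // Spark offers the query's filters; return the ones it must still apply.
  override def pushFilters(filters: Array[Filter]): Array[Filter] = {
    pushed = filters   // pretend the source can evaluate all of them
    Array.empty        // no residual filters left for Spark
  }

  override def pushedFilters(): Array[Filter] = pushed
}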
Hi,
I am using spark v2.3.2. I have an implementation of DSV2. Here is what is
happening:
1) Obtained a dataframe using MyDataSource
scala> val df1 = spark.read.format("com.shubham.MyDataSource").load
> MyDataSource.MyDataSource
> MyDataSource.createReader: Going to create a new MyDataSourceReader
I explored SparkListener#onJobEnd, but then how do I propagate some state
from DataSourceReader to SparkListener?
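One workable pattern, sketched below with invented names: since both the
reader and the listener live on the driver, they can share state through a
driver-side singleton, keyed by something both sides can see (e.g. an
option passed to the source).

import org.apache.spark.scheduler.{SparkListener, SparkListenerJobEnd}
import scala.collection.concurrent.TrieMap

// Invented helper: MyDataSourceReader registers its driver-side resources
// here under some id it derives from its options.
object ReaderStateRegistry {
  val resources = TrieMap.empty[String, AutoCloseable]
}

spark.sparkContext.addSparkListener(new SparkListener {
  override def onJobEnd(jobEnd: SparkListenerJobEnd): Unit = {
    // Clean up whatever the reader registered; a real implementation would
    // key this by query/execution id rather than wiping everything.
    ReaderStateRegistry.resources.values.foreach(_.close())
    ReaderStateRegistry.resources.clear()
  }
})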
On Wed, Jun 12, 2019 at 2:22 PM Shubham Chaurasia
wrote:
> Hi All,
>
> Is there any way to receive an event when a DataSourceReader is
> finished?
> I want to do some clean up
Hi All,
Is there any way to receive an event when a DataSourceReader is finished?
I want to do some clean up after all the DataReaders are finished reading,
and hence need some kind of cleanUp() mechanism at the DataSourceReader
(driver) level.
How to achieve this?
For instance, in DataSourceWriter we have commit() / abort(), which act as
end-of-write hooks on the driver, but there is nothing equivalent on the
read path.
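For contrast, a Scala sketch of those writer-side hooks (only the lifecycle
methods are shown; createWriterFactory is left abstract, and its row type
parameter differs between 2.3.x and 2.4.x):

import org.apache.spark.sql.sources.v2.writer.{DataSourceWriter, WriterCommitMessage}

abstract class MyDataSourceWriter extends DataSourceWriter {
  override def commit(messages: Array[WriterCommitMessage]): Unit = {
    // runs once on the driver after all DataWriters have committed;
    // a natural place for driver-level cleanup on the write path
  }
  override def abort(messages: Array[WriterCommitMessage]): Unit = {
    // driver-side hook when the write fails
  }
}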
> *Sent:* Tuesday, May 7, 2019 9:35:10 AM
> *To:* Shubham Chaurasia
> *Cc:* dev; user@spark.apache.org
> *Subject:* Re: Static partitioning in partitionBy()
>
> It depends on the data source. Delta Lake (https://delta.io) allows you
> to do it with the .option("replaceWhere",
Hi All,
Is there a way I can provide static partitions in partitionBy()?
Like:
df.write.mode("overwrite").format("MyDataSource").partitionBy("c=c1").save
The above code gives the following error, as it tries to find a column
named `c=c1` in df:
org.apache.spark.sql.AnalysisException: Partition column `c=c1` not found
in schema ...
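For what it's worth, a sketch of the usual workaround: partitionBy() takes
column names only, so pin the value by adding it as a literal column and
partitioning on the name.

import org.apache.spark.sql.functions.lit

// Illustrative: every row lands in partition c=c1.
df.withColumn("c", lit("c1"))
  .write
  .mode("overwrite")
  .format("MyDataSource")
  .partitionBy("c")
  .save()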
On Wed, Apr 24, 2019 at 6:30 PM Shubham Chaurasia <
> shubh.chaura...@gmail.com> wrote:
>
>> Hi All,
>>
>> Consider the following(spark v2.4.0):
>>
> Basically I change values of `spark.sql.session.timeZone` and perform an
> ORC write. Here are 3 samples:
>>
Hi All,
Consider the following(spark v2.4.0):
Basically I change values of `spark.sql.session.timeZone` and perform an
ORC write. Here are 3 samples:
1)
scala> spark.conf.set("spark.sql.session.timeZone", "Asia/Kolkata")
scala> val df = sc.parallelize(Seq("2019-04-23
09:15:04.0")).toDF("ts").withColumn("ts", col("ts").cast("timestamp"))
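For anyone skimming, the effect being probed is roughly this (illustrative
snippet; the string is parsed in the session time zone, and timestamps are
also rendered in it):

import org.apache.spark.sql.functions.col
import spark.implicits._

spark.conf.set("spark.sql.session.timeZone", "Asia/Kolkata")
// Parsed as 09:15:04 IST, stored internally as microseconds since the epoch.
val ts = Seq("2019-04-23 09:15:04.0").toDF("ts")
  .withColumn("ts", col("ts").cast("timestamp"))
ts.show(false)   // 2019-04-23 09:15:04

spark.conf.set("spark.sql.session.timeZone", "UTC")
ts.show(false)   // 2019-04-23 03:45:04 (same instant, rendered in UTC)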
> "internal" indicates that Spark doesn't need to convert the values.
>
> Spark's internal representation for a date is the ordinal from the unix
> epoch date, 1970-01-01 = 0.
>
> rb
>
> On Tue, Feb 5, 2019 at 4:46 AM Shubham Chaurasia <
> shu
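A quick worked check of that ordinal representation (plain JDK, nothing
Spark-specific):

import java.time.LocalDate

// DateType's internal value is just days since 1970-01-01.
LocalDate.ofEpochDay(0)               // 1970-01-01
LocalDate.of(2019, 2, 5).toEpochDay   // 17932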
Hi All,
I am using a custom DataSourceV2 implementation (*Spark version 2.3.2*).
Here is how I am trying to pass in a *date type* from spark-shell.
scala> val df =
> sc.parallelize(Seq("2019-02-05")).toDF("datetype").withColumn("datetype",
> col("datetype").cast("date"))
> scala> df.write.format("com
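On the writer side, a sketch of what that date looks like once it reaches
the source, assuming the InternalRow-based DSV2 path (in some 2.3.x code
paths the writer receives Row with java.sql.Date instead):

import java.time.LocalDate
import org.apache.spark.sql.catalyst.InternalRow

// Hypothetical helper: a DateType column arrives as an Int of epoch days.
def dateFromRow(row: InternalRow, ordinal: Int): LocalDate =
  LocalDate.ofEpochDay(row.getInt(ordinal).toLong)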
?
>
>
>
> When I do:
>
> val df = spark.read.format(mypackage).load().show()
>
> I am getting a single creation; how are you creating the reader?
>
>
>
> Thanks,
>
> Assaf
>
>
>
> *From:* Shubham Chaurasia [mailto:shubh.chaura...@gmail.c
> I tried to reproduce it in my
> environment and I am getting just one instance of the reader…
>
>
>
> Thanks,
>
> Assaf
>
>
>
> *From:* Shubham Chaurasia [mailto:shubh.chaura...@gmail.com]
> *Sent:* Tuesday, October 9, 2018 9:31 AM
> *To:* user@spark.apache.org
>
Hi All,
--Spark built with *tags/v2.4.0-rc2*
Consider the following DataSourceReader implementation:

public class MyDataSourceReader implements DataSourceReader, SupportsScanColumnarBatch {
  StructType schema = null;
  Map<String, String> options;

  public MyDataSourceReader(Map<String, String> options) {
    System.out.println
Hi All,
I built spark with *tags/v2.4.0-rc2* using
./build/mvn -DskipTests -Phadoop-2.7 -Dhadoop.version=3.1.0 clean install
Now from spark-shell, whenever I call any static method residing in an
interface, it shows me an error like:
:28: error: Static methods in interface require -target:jvm-1.8