Re: Apache Spark Integration

Josh Mahonin Wed, 19 Jul 2017 16:56:58 -0700

Hi Luqman,

At present, the phoenix-spark integration relies on the table already
existing. It will not create one for you, so you need to create the table
before calling saveToPhoenix.


There has been some discussion of augmenting the supported Spark
'SaveMode's to include 'CREATE IF NOT EXISTS' logic.

https://issues.apache.org/jira/browse/PHOENIX-2745
https://issues.apache.org/jira/browse/PHOENIX-2632
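
In the meantime, one workaround is to issue the DDL yourself before saving.
Below is a minimal sketch, not part of phoenix-spark: the helper name
create_table_ddl and the example table and columns are hypothetical, and it
only builds the statement text. Running it still needs a Phoenix client.

```python
def create_table_ddl(table, columns, primary_key):
    """Build a Phoenix CREATE TABLE IF NOT EXISTS statement (hypothetical helper).

    columns: list of (name, phoenix_sql_type) pairs, in table order.
    primary_key: list of column names that form the row key.
    """
    if not primary_key:
        raise ValueError("a Phoenix table needs at least one primary key column")
    names = [name for name, _ in columns]
    missing = [pk for pk in primary_key if pk not in names]
    if missing:
        raise ValueError(f"primary key columns not in schema: {missing}")
    col_defs = ", ".join(f"{name} {sql_type}" for name, sql_type in columns)
    pk_clause = ", ".join(primary_key)
    return (f"CREATE TABLE IF NOT EXISTS {table} "
            f"({col_defs}, CONSTRAINT pk PRIMARY KEY ({pk_clause}))")


# Illustrative schema only; substitute the columns of your own DataFrame.
ddl = create_table_ddl(
    "OUTPUT_TABLE",
    [("ID", "BIGINT NOT NULL"), ("COL1", "VARCHAR"), ("COL2", "INTEGER")],
    ["ID"],
)
print(ddl)
```

Run the resulting statement once over a Phoenix JDBC connection or from
sqlline.py, then call saveToPhoenix as usual. The IF NOT EXISTS clause makes
the statement safe to re-run.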

Contributions would be most welcome!

Josh


On Tue, Jul 18, 2017 at 6:50 AM, Luqman Ghani <[email protected]> wrote:

> Hi,
>
> I was wondering whether the phoenix-spark connector creates a new table if
> one doesn't already exist, or whether I have to create the table before
> calling the saveToPhoenix function on a DataFrame. It is not evident from
> the tests that Ankit linked.
>
> Thanks,
> Luqman
>
> On Mon, Jul 17, 2017 at 11:23 PM, Luqman Ghani <[email protected]> wrote:
>
>> Thanks Ankit. I am sure this will help.
>>
>> On Mon, Jul 17, 2017 at 11:20 PM, Ankit Singhal <[email protected]> wrote:
>>
>>> You can take a look at the integration tests for the phoenix-spark module:
>>> https://github.com/apache/phoenix/blob/master/phoenix-spark/src/it/scala/org/apache/phoenix/spark/PhoenixSparkIT.scala
>>>
>>> On Mon, Jul 17, 2017 at 9:20 PM, Luqman Ghani <[email protected]> wrote:
>>>
>>>>
>>>> ---------- Forwarded message ----------
>>>> From: Luqman Ghani <[email protected]>
>>>> Date: Sat, Jul 15, 2017 at 2:38 PM
>>>> Subject: Apache Spark Integration
>>>> To: [email protected]
>>>>
>>>>
>>>> Hi,
>>>>
>>>> I am evaluating which approach to use for integrating Phoenix with
>>>> Spark: plain JDBC or phoenix-spark. I have a question about the
>>>> following point, listed under limitations on the Apache Spark Integration
>>>> <https://phoenix.apache.org/phoenix_spark.html> page:
>>>> "
>>>>
>>>>    - The Data Source API does not support passing custom Phoenix
>>>>    settings in configuration, you must create the DataFrame or RDD
>>>>    directly if you need fine-grained configuration.
>>>>
>>>> "
>>>>
>>>> Can someone point me to, or give, an example of how to pass such
>>>> configuration?
>>>>
>>>> Also, the docs
>>>> <https://phoenix.apache.org/phoenix_spark.html#Saving_DataFrames> say
>>>> there is a 'save' function for saving a DataFrame to a table, but I cannot
>>>> find one. Instead, 'saveToPhoenix' shows up in my IntelliJ IDE suggestions.
>>>> I'm using phoenix-4.11.0-HBase-1.2 and Spark-2.0.2. Is this an error in
>>>> the docs?
>>>>
>>>> Thanks,
>>>> Luqman
>>>>
>>>>
>>>
>>
>
