Re: [Spark Core] Custom Catalog. Integration between Apache Ignite and Apache Spark

2017-09-26 Thread Николай Ижиков
Hello, Xin.

Thank you for your answer.

Are there any plans to make the catalog API public?
Any specific release versions or dates?




-- 
Nikolay Izhikov
nizhikov@gmail.com


Re: [Spark Core] Custom Catalog. Integration between Apache Ignite and Apache Spark

2017-09-25 Thread Reynold Xin
It's probably just an indication of a lack of interest (or at least there
isn't a substantial overlap between Ignite users and Spark users). A new
catalog implementation is also pretty fundamental to Spark, and the bar for
that would be pretty high. See my comment in SPARK-17767.

Guys - while I think this is very useful to do, I'm going to mark this as
"later" for now. The reason is that there are a lot of things to consider
before making this switch, including:

   - The ExternalCatalog API is currently internal, and we can't just make
     it public without thinking about the consequences and whether this API
     is maintainable in the long run.
   - SPARK-15777: we need to design this in the context of catalog
     federation and persistence.
   - SPARK-15691: refactoring of how we integrate with Hive.

This is not as simple as just submitting a PR to make it pluggable.
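
To make the scale concrete: a custom catalog has to cover the whole internal
surface. A rough skeleton of what an Ignite-backed catalog would look like
(a hypothetical sketch only, trimmed to three of the several dozen methods,
so it will not compile as written; the signatures follow the current internal
code and may drift between releases):

    import org.apache.spark.sql.catalyst.catalog.{CatalogDatabase, CatalogTable, ExternalCatalog}

    // Hypothetical: no IgniteExternalCatalog exists in Spark, and because
    // ExternalCatalog is internal, a subclass like this can break between
    // releases.
    class IgniteExternalCatalog extends ExternalCatalog {

      override def createDatabase(dbDefinition: CatalogDatabase, ignoreIfExists: Boolean): Unit =
        ??? // e.g. map the database onto an Ignite SQL schema

      override def getTable(db: String, table: String): CatalogTable =
        ??? // translate Ignite table metadata into Spark's CatalogTable

      override def listTables(db: String): Seq[String] =
        ??? // enumerate Ignite tables in the given schema

      // ...plus several dozen more methods for partitions, functions,
      // statistics, renames, and alterations that Spark calls internally.
    }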



Re: [Spark Core] Custom Catalog. Integration between Apache Ignite and Apache Spark

2017-09-25 Thread Николай Ижиков
Guys,

Did I miss something, or was my mail completely off the mark?
Can you give me some feedback?
What have I missed in my proposal?



-- 
Nikolay Izhikov
nizhikov@gmail.com


Re: [Spark Core] Custom Catalog. Integration between Apache Ignite and Apache Spark

2017-09-19 Thread Nikolay Izhikov

Guys,

Has anyone had a chance to look at my message?




[Spark Core] Custom Catalog. Integration between Apache Ignite and Apache Spark

2017-09-15 Thread Nikolay Izhikov

Hello, guys.

I'm a contributor to the Apache Ignite project, which describes itself as
an in-memory computing platform.


It provides Data Grid features: a distributed, transactional key-value
store [1], distributed SQL support [2], and more [3].


Currently, I'm working on the integration between Ignite and Spark [4].
I want to add Spark Data Frame API support for Ignite.
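
My current thinking is to start from the public Data Source API. A minimal
sketch of the shape (class names and the "cache" option are placeholders of
mine, not an existing integration):

    import org.apache.spark.rdd.RDD
    import org.apache.spark.sql.{Row, SQLContext}
    import org.apache.spark.sql.sources.{BaseRelation, RelationProvider, TableScan}
    import org.apache.spark.sql.types.StructType

    // Placeholder names throughout; only the Spark interfaces are real.
    class IgniteRelationProvider extends RelationProvider {
      override def createRelation(
          sqlContext: SQLContext,
          parameters: Map[String, String]): BaseRelation =
        new IgniteRelation(sqlContext, parameters("cache"))
    }

    class IgniteRelation(
        override val sqlContext: SQLContext,
        cache: String) extends BaseRelation with TableScan {

      // The schema would be derived from Ignite table metadata.
      override def schema: StructType = ???

      // The scan would read the Ignite cache partitions as rows.
      override def buildScan(): RDD[Row] = ???
    }

A relation built this way still has to be registered by hand, though; it is
not visible through the catalog, which is exactly why the Catalog question
below matters.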

Since Ignite is a distributed store, it would be useful to create an
implementation of the Catalog [5] for Apache Ignite.


I see two ways to implement this feature:

1. Spark can provide an API for any custom catalog implementation. As
far as I can see, there is a ticket for it [6]; it is closed with resolution
"Later". Is it a suitable time to continue working on that ticket? How can I
help with it? (See the sketch after this list.)


2. I can provide an implementation of the Catalog and the other required
API in the form of a pull request to Spark, as was done for Hive [7]. Would
such a pull request be acceptable?
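
For option 1, I imagine the hook would be something like Spark's existing
catalog selection conf. A hypothetical usage sketch (the "ignite" value is
made up; as far as I can tell, only "in-memory" and "hive" are accepted
today):

    import org.apache.spark.sql.SparkSession

    // Hypothetical: "ignite" is NOT a valid value today; Spark resolves
    // only "in-memory" and "hive" to hard-coded catalog classes. A pluggable
    // API would let this conf (or a new one) name any ExternalCatalog
    // implementation found on the classpath.
    val spark = SparkSession.builder()
      .appName("ignite-catalog-example")
      .config("spark.sql.catalogImplementation", "ignite")
      .getOrCreate()

    spark.catalog.listTables().show() // would list Ignite-managed tables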


Which way is more convenient for the Spark community?

[1] https://ignite.apache.org/features/datagrid.html
[2] https://ignite.apache.org/features/sql.html
[3] https://ignite.apache.org/features.html
[4] https://issues.apache.org/jira/browse/IGNITE-3084
[5] https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/ExternalCatalog.scala

[6] https://issues.apache.org/jira/browse/SPARK-17767
[7] https://github.com/apache/spark/blob/master/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala

