Re: [ANNOUNCEMENT] September 2019 Bay Area Apache Flink Meetup

2019-09-20 Thread Xuefu Zhang
Hi all,

Happy Friday!

As a kind reminder, the meetup is ON next Tuesday at Yelp HQ in San
Francisco. See you all there at 6:30pm.

Regards,
Xuefu

On Fri, Aug 30, 2019 at 11:44 AM Xuefu Zhang  wrote:

> Hi all,
>
> As promised, we planned to have quarterly Flink meetups, and now it's about
> time. Thus, I'm happy to announce that the next Bay Area Apache Flink
> Meetup [1] is scheduled for Sept. 24 at Yelp, 140 New Montgomery in San
> Francisco.
>
> Schedule:
>
> 6:30 PM - Doors open
> 6:30 - 7:00 PM Networking and Refreshments
> 7:00 - 8:30 PM Short talks
>
> -- Two years of Flink @ Yelp (Enrico Canzonieri, 30m)
> -- How BNP Paribas Fortis uses Flink for real-time fraud detection (David
> Massart, tentative)
>
> Please refer to the meetup page [1] for more details.
>
> Many thanks go to Yelp for their sponsorship. At the same time, we might
> still have room for one more short talk. Please let me know if interested.
>
>
> Thanks,
>
> Xuefu
>
> [1] https://www.meetup.com/Bay-Area-Apache-Flink-Meetup/events/262680261/
>
>


[jira] [Created] (FLINK-14005) Support Hive version 2.2.0

2019-09-08 Thread Xuefu Zhang (Jira)
Xuefu Zhang created FLINK-14005:
---

 Summary: Support Hive version 2.2.0
 Key: FLINK-14005
 URL: https://issues.apache.org/jira/browse/FLINK-14005
 Project: Flink
  Issue Type: Improvement
  Components: Connectors / Hive
Reporter: Xuefu Zhang
Assignee: Xuefu Zhang
 Fix For: 1.10.0


This is to support Hive version 2.2.0.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Created] (FLINK-13998) Fix ORC test failure with Hive 2.0.x

2019-09-06 Thread Xuefu Zhang (Jira)
Xuefu Zhang created FLINK-13998:
---

 Summary: Fix ORC test failure with Hive 2.0.x
 Key: FLINK-13998
 URL: https://issues.apache.org/jira/browse/FLINK-13998
 Project: Flink
  Issue Type: Improvement
  Components: Connectors / Hive
Reporter: Xuefu Zhang
Assignee: Xuefu Zhang
 Fix For: 1.10.0


Including 2.0.0 and 2.0.1.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Created] (FLINK-13947) Check Hive shim serialization in Hive UDF wrapper classes and test coverage

2019-09-03 Thread Xuefu Zhang (Jira)
Xuefu Zhang created FLINK-13947:
---

 Summary: Check Hive shim serialization in Hive UDF wrapper classes 
and test coverage
 Key: FLINK-13947
 URL: https://issues.apache.org/jira/browse/FLINK-13947
 Project: Flink
  Issue Type: Improvement
  Components: Connectors / Hive
Reporter: Xuefu Zhang
Assignee: Xuefu Zhang
 Fix For: 1.10.0


Check that the Hive shim is properly handled when serializing the Hive UDF 
wrapper classes, and add test coverage.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Created] (FLINK-13931) Support Hive version 2.0.x

2019-08-30 Thread Xuefu Zhang (Jira)
Xuefu Zhang created FLINK-13931:
---

 Summary: Support Hive version 2.0.x
 Key: FLINK-13931
 URL: https://issues.apache.org/jira/browse/FLINK-13931
 Project: Flink
  Issue Type: Improvement
  Components: Connectors / Hive
Reporter: Xuefu Zhang
Assignee: Xuefu Zhang
 Fix For: 1.10.0


Including 2.0.0 and 2.0.1.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[ANNOUNCEMENT] September 2019 Bay Area Apache Flink Meetup

2019-08-30 Thread Xuefu Zhang
Hi all,

As promised, we planned to have quarterly Flink meetups, and now it's about
time. Thus, I'm happy to announce that the next Bay Area Apache Flink
Meetup [1] is scheduled for Sept. 24 at Yelp, 140 New Montgomery in San
Francisco.

Schedule:

6:30 PM - Doors open
6:30 - 7:00 PM Networking and Refreshments
7:00 - 8:30 PM Short talks

-- Two years of Flink @ Yelp (Enrico Canzonieri, 30m)
-- How BNP Paribas Fortis uses Flink for real-time fraud detection (David
Massart, tentative)

Please refer to the meetup page [1] for more details.

Many thanks go to Yelp for their sponsorship. At the same time, we might
still have room for one more short talk. Please let me know if interested.


Thanks,

Xuefu

[1] https://www.meetup.com/Bay-Area-Apache-Flink-Meetup/events/262680261/


[jira] [Created] (FLINK-13930) Support Hive version 3.1.x

2019-08-30 Thread Xuefu Zhang (Jira)
Xuefu Zhang created FLINK-13930:
---

 Summary: Support Hive version 3.1.x
 Key: FLINK-13930
 URL: https://issues.apache.org/jira/browse/FLINK-13930
 Project: Flink
  Issue Type: Improvement
  Components: Connectors / Hive
Reporter: Xuefu Zhang
Assignee: Xuefu Zhang
 Fix For: 1.10.0


Including 3.1.0, 3.1.1, and 3.1.2.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Created] (FLINK-13903) Support Hive version 2.3.6

2019-08-29 Thread Xuefu Zhang (Jira)
Xuefu Zhang created FLINK-13903:
---

 Summary: Support Hive version 2.3.6
 Key: FLINK-13903
 URL: https://issues.apache.org/jira/browse/FLINK-13903
 Project: Flink
  Issue Type: Improvement
  Components: Connectors / Hive
Reporter: Xuefu Zhang
Assignee: Xuefu Zhang
 Fix For: 1.10.0


Hive 2.3.6 was released a few days ago. We can trivially support this version as 
well, as we have already provided support for previous 2.3.x releases.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Created] (FLINK-13877) Support Hive version 2.1.0 and 2.1.1

2019-08-27 Thread Xuefu Zhang (Jira)
Xuefu Zhang created FLINK-13877:
---

 Summary: Support Hive version 2.1.0 and 2.1.1
 Key: FLINK-13877
 URL: https://issues.apache.org/jira/browse/FLINK-13877
 Project: Flink
  Issue Type: Improvement
  Components: Connectors / Hive
Reporter: Xuefu Zhang
 Fix For: 1.10.0


This is to support Hive versions 2.1.0 and 2.1.1.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Created] (FLINK-13841) Extend Hive version support to all 1.2 and 2.3 versions

2019-08-23 Thread Xuefu Zhang (Jira)
Xuefu Zhang created FLINK-13841:
---

 Summary: Extend Hive version support to all 1.2 and 2.3 versions
 Key: FLINK-13841
 URL: https://issues.apache.org/jira/browse/FLINK-13841
 Project: Flink
  Issue Type: Improvement
  Components: Connectors / Hive
Reporter: Xuefu Zhang
 Fix For: 1.10.0


This is to support all 1.2 (1.2.0, 1.2.1, 1.2.2) and 2.3 (2.3.0-5) versions.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Created] (FLINK-13643) Document the workaround for users with a different minor Hive version

2019-08-07 Thread Xuefu Zhang (JIRA)
Xuefu Zhang created FLINK-13643:
---

 Summary: Document the workaround for users with a different minor 
Hive version
 Key: FLINK-13643
 URL: https://issues.apache.org/jira/browse/FLINK-13643
 Project: Flink
  Issue Type: Improvement
  Components: Connectors / Hive, Table SQL / API
Affects Versions: 1.9.0
Reporter: Xuefu Zhang
 Fix For: 1.9.0


We officially support two Hive versions. However, we can tell users how to work 
around the limitation if their Hive version differs only in the minor version.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Created] (FLINK-13569) DDL table property key is defined as identifier but should be string literal instead

2019-08-04 Thread Xuefu Zhang (JIRA)
Xuefu Zhang created FLINK-13569:
---

 Summary: DDL table property key is defined as identifier but 
should be string literal instead
 Key: FLINK-13569
 URL: https://issues.apache.org/jira/browse/FLINK-13569
 Project: Flink
  Issue Type: Bug
Reporter: Xuefu Zhang


The key name should be any free text, and should not be constrained by the 
identifier grammar.
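
For illustration, a minimal sketch of the kind of DDL the grammar should accept 
(Flink 1.9-era API; the connector properties below are placeholders chosen only 
to show that typical keys contain dots and therefore cannot be parsed as plain 
identifiers):

{code:java}
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class PropertyKeyLiteralExample {
    public static void main(String[] args) {
        TableEnvironment tableEnv = TableEnvironment.create(
                EnvironmentSettings.newInstance().useBlinkPlanner().inStreamingMode().build());

        // Property keys such as 'connector.type' are free text (they contain dots),
        // so the DDL grammar needs to treat them as string literals, not identifiers.
        tableEnv.sqlUpdate(
                "CREATE TABLE t (\n"
                + "  id INT\n"
                + ") WITH (\n"
                + "  'connector.type' = 'filesystem',\n"
                + "  'connector.path' = '/tmp/t',\n"
                + "  'format.type' = 'csv'\n"
                + ")");
    }
}
{code}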



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Created] (FLINK-13568) DDL create table doesn't allow STRING data type

2019-08-04 Thread Xuefu Zhang (JIRA)
Xuefu Zhang created FLINK-13568:
---

 Summary: DDL create table doesn't allow STRING data type
 Key: FLINK-13568
 URL: https://issues.apache.org/jira/browse/FLINK-13568
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / API
Affects Versions: 1.9.0
Reporter: Xuefu Zhang


Creating a table with "string" data type fails with tableEnv.sqlUpdate().
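
A minimal sketch of the failing case as described (Flink 1.9-era API; the WITH 
properties are placeholders for illustration only):

{code:java}
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class StringTypeDdlRepro {
    public static void main(String[] args) {
        TableEnvironment tableEnv = TableEnvironment.create(
                EnvironmentSettings.newInstance().useBlinkPlanner().inStreamingMode().build());

        // STRING is a supported Table API type (a synonym of VARCHAR(2147483647)),
        // but the reported bug is that the DDL parser rejects it here.
        tableEnv.sqlUpdate(
                "CREATE TABLE users (\n"
                + "  name STRING\n"
                + ") WITH (\n"
                + "  'connector.type' = 'filesystem',\n"
                + "  'connector.path' = '/tmp/users',\n"
                + "  'format.type' = 'csv'\n"
                + ")");
    }
}
{code}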



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Created] (FLINK-13501) Fixes a few issues in documentation for Hive integration

2019-07-30 Thread Xuefu Zhang (JIRA)
Xuefu Zhang created FLINK-13501:
---

 Summary: Fixes a few issues in documentation for Hive integration
 Key: FLINK-13501
 URL: https://issues.apache.org/jira/browse/FLINK-13501
 Project: Flink
  Issue Type: Bug
  Components: Connectors / Hive, Table SQL / API
Affects Versions: 1.9.0
Reporter: Xuefu Zhang
 Fix For: 1.9.0, 1.10.0


Going through the existing Hive documentation, I found the following issues 
that should be addressed:
1. The "Hive Integration" section should come after "SQL Client" (at the same 
level).
2. In the Catalog section, there are headers named "Hive Catalog". Also, some 
information there is duplicated in "Hive Integration".
3. "Data Type Mapping" is Hive specific and should probably move to "Hive 
Integration".



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Created] (FLINK-13443) Support/Map Hive INTERVAL type in Hive connector

2019-07-26 Thread Xuefu Zhang (JIRA)
Xuefu Zhang created FLINK-13443:
---

 Summary: Support/Map Hive INTERVAL type in Hive connector
 Key: FLINK-13443
 URL: https://issues.apache.org/jira/browse/FLINK-13443
 Project: Flink
  Issue Type: Improvement
  Components: Connectors / Hive, Table SQL / API
Affects Versions: 1.9.0
Reporter: Xuefu Zhang
 Fix For: 1.10.0


The issue came up while working on FLINK-13385. We need to figure out how to 
map between Flink's INTERVAL type and Hive's.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


Re: [DISCUSS] Support temporary tables in SQL API

2019-07-22 Thread Xuefu Zhang
Thanks to Dawid for initiating the discussion. Overall, I agree with Timo
that for 1.9 we should have some quick and simple solution, leaving time
for more thorough discussions for 1.10.

In particular, I'm not fully on board with solution #1. For one thing, it
seems to propose storing all temporary objects in a memory map in
CatalogManager, and that memory map duplicates the functionality of the
in-memory catalog, which also stores temporary objects. For another, as
pointed out by the Google doc, different databases may handle temporary
tables differently, so it may make more sense to let each catalog handle
its own temporary objects.
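
For illustration only (the class shapes below are simplified, hypothetical
sketches, not code from either proposal), the duplication concern looks
roughly like this:

import java.util.HashMap;
import java.util.Map;

// Sketch of the concern with solution #1: CatalogManager keeps its own map of
// temporary objects, duplicating what an in-memory catalog (which can also
// hold temporary objects) already provides.
class InMemoryCatalog {
    private final Map<String, Object> tables = new HashMap<>();

    void createTable(String path, Object table) {
        tables.put(path, table);
    }
}

class CatalogManager {
    // bookkeeping proposed by solution #1 ...
    private final Map<String, Object> temporaryTables = new HashMap<>();

    // ... overlapping with storage the in-memory catalog already offers
    private final InMemoryCatalog inMemoryCatalog = new InMemoryCatalog();
}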

Therefore, postponing the fix buys us time to flesh out all the details.

Thanks,
Xuefu

On Mon, Jul 22, 2019 at 7:19 AM Timo Walther  wrote:

> Thanks for summarizing our offline discussion Dawid! Even though I would
> prefer solution 1 instead of releasing half-baked features, I also
> understand that the Table API should not further block the next release.
> Therefore, I would be fine with solution 3 but introduce the new
> user-facing `createTemporaryTable` methods as synonyms of the existing
> ones already. This allows us to deprecate the methods with undefined
> behavior as early as possible.
>
> Thanks,
> Timo
>
>
> Am 22.07.19 um 16:13 schrieb Dawid Wysakowicz:
> > Hi all,
> >
> > When working on FLINK-13279[1] we realized we could benefit from a
> > better temporary objects support in the Catalog API/Table API.
> > Unfortunately, we are already long past the feature freeze, which is why I
> > wanted to get some opinions from the community on how we should proceed
> > with this topic. I tried to prepare a summary of the current state and 3
> > different suggested approaches that we could take. Please see the
> > attached document[2]
> >
> > I will appreciate your thoughts!
> >
> >
> > [1] https://issues.apache.org/jira/browse/FLINK-13279
> >
> > [2]
> >
> https://docs.google.com/document/d/1RxLj4tDB9GXVjF5qrkM38SKUPkvJt_BSefGYTQ-cVX4/edit?usp=sharing
> >
> >
>
>


[ANNOUNCE] Feature freeze for Apache Flink 1.9.0 release

2019-07-05 Thread Xuefu Zhang
On Hive integration, we are still actively working on Hive UDF support.
There are still some dependent items being developed by other teams, but we
would still like to complete our feature as planned.

Thanks,
Xuefu


-- Forwarded message -
From: Kurt Young 
Date: Fri, Jul 5, 2019 at 7:06 AM
Subject: Re: [ANNOUNCE] Feature freeze for Apache Flink 1.9.0 release
To: dev 
Cc: Tzu-Li (Gordon) Tai 


Here are the features I collected which are under active development and
close to being merged:

1. Bridge blink planner to unified table environment and remove TableConfig
from blink planner
2. Support timestamp with local time zone and partition pruning in blink
planner
3. Support JDBC & HBase lookup function and upsert sink
4. StreamExecutionEnvironment supports executing job with StreamGraph, and
blink planner should set proper properties to StreamGraph
5. Set resource profiles to task and enable managed memory as resource
profile

Best,
Kurt


On Fri, Jul 5, 2019 at 9:37 PM Kurt Young  wrote:

> Hi devs,
>
> It's July 5 now and we should announce feature freeze and cut the branch
> as planned. However, some components still seem not ready and various
> features are still under development or review.
>
> But we also cannot extend the freeze date again, which would further delay
> the release date. I think freezing new features today and allowing another
> couple of buffer days, so that features which are almost ready still have a
> chance to get in, is a reasonable solution.
>
> I hereby announce that the features of Flink 1.9.0 are frozen; *July 11*
> will be the day for cutting the branch. Since the feature freeze has
> effectively taken place, I kindly ask committers to refrain from merging
> features that are planned for future releases into the master branch for
> the time being, before the 1.9 branch is cut. We understand this might be
> a bit inconvenient; thanks for the cooperation here.
>
> Best,
> Kurt
>
>
> On Fri, Jul 5, 2019 at 5:19 PM 罗齐  wrote:
>
>> Hi Gordon,
>>
>> Will branch 1.9 be cut today? We're really looking forward to the
>> blink features in 1.9.
>>
>> Thanks,
>> Qi
>>
>> On Wed, Jun 26, 2019 at 7:18 PM Tzu-Li (Gordon) Tai 
>> wrote:
>>
>>> Thanks for the updates so far everyone!
>>>
>>> Since support for the new Blink-based Table / SQL runner and
fine-grained
>>> recovery are quite prominent features for 1.9.0,
>>> and developers involved in these topics have already expressed that these
>>> could make good use of another week,
>>> I think it definitely makes sense to postpone the feature freeze.
>>>
>>> The new date for feature freeze and feature branch cut for 1.9.0 will be
>>> *July
>>> 5*.
>>>
>>> Please update on this thread if there are any further concerns!
>>>
>>> Cheers,
>>> Gordon
>>>
>>> On Tue, Jun 25, 2019 at 9:05 PM Chesnay Schepler 
>>> wrote:
>>>
>>> > On the fine-grained recovery / batch scheduling side we could make
good
>>> > use of another week.
>>> > Currently we are on track to have the _feature_ merged, but without
>>> > having done a great deal of end-to-end testing.
>>> >
>>> > On 25/06/2019 15:01, Kurt Young wrote:
>>> > > Hi Aljoscha,
>>> > >
>>> > > I also feel an additional week can make the remaining work easier.
>>> > > At least
>>> > > we don't have to check in lots of commits in both branches (master &
>>> > > release-1.9).
>>> > >
>>> > > Best,
>>> > > Kurt
>>> > >
>>> > >
>>> > > On Tue, Jun 25, 2019 at 8:27 PM Aljoscha Krettek <
>>> aljos...@apache.org>
>>> > > wrote:
>>> > >
>>> > >> A few threads are converging around supporting the new Blink-based
>>> Table
>>> > >> API Runner/Planner. I think hitting the currently proposed feature
>>> > freeze
>>> > >> date is hard, if not impossible, and that the work would benefit
>>> from an
>>> > >> additional week to get everything in with good quality.
>>> > >>
>>> > >> What do the others involved in the topic think?
>>> > >>
>>> > >> Aljoscha
>>> > >>
>>> > >>> On 24. Jun 2019, at 19:42, Bowen Li  wrote:
>>> > >>>
>>> > >>> Hi Gordon,
>>> > >>>
>>> > >>> Thanks for driving this effort.
>>> > >>>
>>> > >>> Xuefu responded to the discussion thread [1] and I want to bring
>>> that
>>> > to
>>> > >>> our attention here:
>>> > >>>
>>> > >>> Hive integration depends on a few features that are actively
>>> developed.
>>> > >> If
>>> > >>> the completion of those features doesn't leave enough time for us to
>>> > >>> integrate, then our work can potentially go beyond the proposed
>>> date.
>>> > >>>
>>> > >>> Just wanted to point out such a dependency adds uncertainty.
>>> > >>>
>>> > >>> [1]
>>> > >>>
>>> > >>
>>> >
>>>
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Features-for-Apache-Flink-1-9-0-td28701.html
>>> > >>> On Thu, Jun 20, 2019 at 1:05 AM Tzu-Li (Gordon) Tai <
>>> > tzuli...@apache.org
>>> > >>>
>>> > >>> wrote:
>>> > >>>
>>> >  Hi devs,
>>> > 
>>> >  Per the feature discussions for 1.9.0 [1], I hereby announce the
>>> > >> official
>>> >  feature freeze for 

[jira] [Created] (FLINK-13047) Fix the Optional.orElse() usage issue in DatabaseCalciteSchema

2019-07-01 Thread Xuefu Zhang (JIRA)
Xuefu Zhang created FLINK-13047:
---

 Summary: Fix the Optional.orElse() usage issue in 
DatabaseCalciteSchema 
 Key: FLINK-13047
 URL: https://issues.apache.org/jira/browse/FLINK-13047
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / API
Affects Versions: 1.9.0
Reporter: Xuefu Zhang
Assignee: Xuefu Zhang


It was found that Optional.orElse() always evaluates its argument, even when the 
Optional is non-empty and Optional.get() would be returned. If that evaluation 
throws an exception, the call fails even though the Optional holds a value. This 
is the case in {{DatabaseCalciteSchema.convertCatalogTable()}}.
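
A small, self-contained illustration of the Optional semantics in question 
(plain Java, not Flink code):

{code:java}
import java.util.Optional;

public class OrElseDemo {
    private static String throwingFallback() {
        throw new IllegalStateException("evaluated even though the Optional has a value");
    }

    public static void main(String[] args) {
        Optional<String> value = Optional.of("present");

        // orElseGet() is lazy: the supplier is not invoked, because the Optional is non-empty.
        System.out.println(value.orElseGet(OrElseDemo::throwingFallback));

        // orElse() is eager: its argument is evaluated before the method is called,
        // so this line throws even though the Optional is non-empty.
        System.out.println(value.orElse(throwingFallback()));
    }
}
{code}

A fix along the lines of {{orElseGet()}} (or an explicit {{isPresent()}} check) 
would avoid the eager evaluation.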



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-13023) Generate HiveTableSource from a Hive table

2019-06-27 Thread Xuefu Zhang (JIRA)
Xuefu Zhang created FLINK-13023:
---

 Summary: Generate HiveTableSource from a Hive table
 Key: FLINK-13023
 URL: https://issues.apache.org/jira/browse/FLINK-13023
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / API
Reporter: Xuefu Zhang
Assignee: Xuefu Zhang


As a followup for FLINK-11480, this adds the conversion from a Hive table to a 
table source that's used for the data connector reading side.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: [ANNOUNCEMENT] June 2019 Bay Area Apache Flink Meetup

2019-06-26 Thread Xuefu Zhang
Hi all,

As a gentle reminder, the meetup [1] will be held today at 6:30 PM at Zendesk,
1019 Market Street, SF. Come on in for enlightening talks as well as food
and drinks.

See you there!

Regards,
Xuefu

[1] https://www.meetup.com/Bay-Area-Apache-Flink-Meetup/events/262216929/


On Fri, Jun 21, 2019 at 9:32 AM Xuefu Zhang  wrote:

> Hi all,
>
> As the event is around the corner, please RSVP at meetup.com if you haven't
> responded yet. Otherwise, I will see you next Wednesday, June 26.
>
> Regards,
> Xuefu
>
> On Mon, Jun 10, 2019 at 7:50 PM Xuefu Zhang  wrote:
>
>> Hi all,
>>
>> As promised, we planned to have quarterly Flink meetups, and now it's about
>> time. Thus, I'm happy to announce that the next Bay Area Apache Flink
>> Meetup [1] is scheduled for June 26 at Zendesk, 1019 Market Street in San
>> Francisco.
>>
>> Schedule:
>>
>> 6:30 - 7:00 PM Networking and Refreshments
>> 7:00 - 8:30 PM Short talks
>>
>> Many thanks go to Zendesk for their sponsorship. At the same time, we are
>> open to 2 or 3 short talks. If interested, please let me know.
>>
>> Thanks,
>>
>> Xuefu
>>
>> [1] https://www.meetup.com/Bay-Area-Apache-Flink-Meetup/events/262216929
>>>
>>


[jira] [Created] (FLINK-12989) Add implementation of converting Hive catalog table to Hive table sink

2019-06-25 Thread Xuefu Zhang (JIRA)
Xuefu Zhang created FLINK-12989:
---

 Summary: Add implementation of converting Hive catalog table to 
Hive table sink
 Key: FLINK-12989
 URL: https://issues.apache.org/jira/browse/FLINK-12989
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / API
Reporter: Xuefu Zhang
Assignee: Xuefu Zhang


As a followup for FLINK-11480, this adds the conversion from a Hive table to a 
table sink that's used for the data connector writing side.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: [ANNOUNCEMENT] June 2019 Bay Area Apache Flink Meetup

2019-06-21 Thread Xuefu Zhang
Hi all,

As the event is around the corner, please RSVP at meetup.com if you haven't
responded yet. Otherwise, I will see you next Wednesday, June 26.

Regards,
Xuefu

On Mon, Jun 10, 2019 at 7:50 PM Xuefu Zhang  wrote:

> Hi all,
>
> As promised, we planned to have quarterly Flink meetups, and now it's about
> time. Thus, I'm happy to announce that the next Bay Area Apache Flink
> Meetup [1] is scheduled for June 26 at Zendesk, 1019 Market Street in San
> Francisco.
>
> Schedule:
>
> 6:30 - 7:00 PM Networking and Refreshments
> 7:00 - 8:30 PM Short talks
>
> Many thanks go to Zendesk for their sponsorship. At the same time, we are
> open to 2 or 3 short talks. If interested, please let me know.
>
> Thanks,
>
> Xuefu
>
> [1] https://www.meetup.com/Bay-Area-Apache-Flink-Meetup/events/262216929
>>
>


Re: [ANNOUNCEMENT] March 2019 Bay Area Apache Flink Meetup

2019-06-17 Thread Xuefu Zhang
Hi all,

The scheduled meetup is only about a week away. Please note that RSVP at
meetup.com is required.  In order for us to get the actual headcount to
prepare for the event, please sign up as soon as possible if you plan to
join. Thank you very much for your cooperation.

Regards,
Xuefu

On Thu, Feb 14, 2019 at 4:32 PM Xuefu Zhang  wrote:

> Hi all,
>
> I'm very excited to announce that the community is planning the next
> meetup in the Bay Area on March 25, 2019. The event was just announced on
> Meetup.com [1].
>
> To make the event successful, your participation and help will be needed.
> Currently, we are looking for an organization that can host the event.
> Please let me know if you have any leads.
>
> Secondly, we encourage Flink users and developers to take this as an
> opportunity to share experience or development. Thus, please let me know if
> you'd like to give a short talk.
>
> I look forward to meeting you all in the Meetup.
>
> Regards,
> Xuefu
>
> [1] https://www.meetup.com/Bay-Area-Apache-Flink-Meetup/events/258975465
>


[ANNOUNCEMENT] June 2019 Bay Area Apache Flink Meetup

2019-06-10 Thread Xuefu Zhang
Hi all,

As promised, we planned to have quarterly Flink meetups, and now it's about
time. Thus, I'm happy to announce that the next Bay Area Apache Flink
Meetup [1] is scheduled for June 26 at Zendesk, 1019 Market Street in San
Francisco.

Schedule:

6:30 - 7:00 PM Networking and Refreshments
7:00 - 8:30 PM Short talks

Many thanks go to Zendesk for their sponsorship. At the same time, we are
open to 2 or 3 short talks. If interested, please let me know.

Thanks,

Xuefu

[1] https://www.meetup.com/Bay-Area-Apache-Flink-Meetup/events/262216929
>


[jira] [Created] (FLINK-12552) Combine HiveCatalog and GenericHiveMetastoreCatalog

2019-05-19 Thread Xuefu Zhang (JIRA)
Xuefu Zhang created FLINK-12552:
---

 Summary: Combine HiveCatalog and GenericHiveMetastoreCatalog
 Key: FLINK-12552
 URL: https://issues.apache.org/jira/browse/FLINK-12552
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / API
Reporter: Xuefu Zhang
Assignee: Xuefu Zhang






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-12469) Clean up catalog API on default/current DB

2019-05-09 Thread Xuefu Zhang (JIRA)
Xuefu Zhang created FLINK-12469:
---

 Summary: Clean up catalog API on default/current DB
 Key: FLINK-12469
 URL: https://issues.apache.org/jira/browse/FLINK-12469
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / API
Affects Versions: 1.9.0
Reporter: Xuefu Zhang
Assignee: Xuefu Zhang


Currently the catalog API has get/setCurrentDatabase(), which is more user 
session specific. By our design principle, a catalog instance is agnostic to 
user sessions, so the current-database concept doesn't belong there. However, a 
catalog should support a (configurable) default database, which would be taken 
as the user's current database when the session doesn't specify one.

This JIRA is to remove the current-database concept from the catalog API and 
add a default database instead.
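
A minimal sketch of the described direction (the method name below is an 
illustrative assumption, not the final API):

{code:java}
// Illustrative only: the catalog exposes a configurable default database and no
// longer tracks a per-session "current" database.
interface Catalog {

    /**
     * The database to fall back on when a user session has not set a current
     * database. Configurable per catalog instance.
     */
    String getDefaultDatabase();

    // get/setCurrentDatabase() are intentionally absent: the "current" database
    // is a session-level concern and is tracked outside the catalog instance.
}
{code}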



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-12442) Add some more (negative) test cases for FLINK-12365

2019-05-07 Thread Xuefu Zhang (JIRA)
Xuefu Zhang created FLINK-12442:
---

 Summary: Add some more (negative) test cases for FLINK-12365
 Key: FLINK-12442
 URL: https://issues.apache.org/jira/browse/FLINK-12442
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / API
Reporter: Xuefu Zhang


As a followup for FLINK-12365, add more, especially negative, test cases.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-12441) Add column stats for decimal type

2019-05-07 Thread Xuefu Zhang (JIRA)
Xuefu Zhang created FLINK-12441:
---

 Summary: Add column stats for decimal type
 Key: FLINK-12441
 URL: https://issues.apache.org/jira/browse/FLINK-12441
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / API
Reporter: Xuefu Zhang
Assignee: Xuefu Zhang


As a followup for FLINK-12365, add column stats type for decimal.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-12366) Clean up Catalog APIs to make them more consistent and coherent

2019-04-29 Thread Xuefu Zhang (JIRA)
Xuefu Zhang created FLINK-12366:
---

 Summary: Clean up Catalog APIs to make them more consistent and 
coherent 
 Key: FLINK-12366
 URL: https://issues.apache.org/jira/browse/FLINK-12366
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / API
Affects Versions: 1.9.0
Reporter: Xuefu Zhang
Assignee: Xuefu Zhang


Some of the APIs seem inconsistent with others in terms of exceptions thrown and 
error handling. This is to clean them up to maintain consistency and coherence.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-12365) Add stats related catalog APIs

2019-04-29 Thread Xuefu Zhang (JIRA)
Xuefu Zhang created FLINK-12365:
---

 Summary: Add stats related catalog APIs
 Key: FLINK-12365
 URL: https://issues.apache.org/jira/browse/FLINK-12365
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / API
Reporter: Xuefu Zhang
Assignee: Xuefu Zhang
 Fix For: 1.9.0


This is to add statistics-related APIs to the catalog.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-12255) Rename a few exception class names that were migrated from scala

2019-04-18 Thread Xuefu Zhang (JIRA)
Xuefu Zhang created FLINK-12255:
---

 Summary: Rename a few exception class names that were migrated 
from scala
 Key: FLINK-12255
 URL: https://issues.apache.org/jira/browse/FLINK-12255
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / API
Affects Versions: 1.9.0
Reporter: Xuefu Zhang
Assignee: Xuefu Zhang


As a followup of FLINK-11474 and based on PR review comments, a few exception 
(java) classes will be renamed.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: [ANNOUNCEMENT] March 2019 Bay Area Apache Flink Meetup

2019-03-18 Thread Xuefu Zhang
Hi all,

This is a reminder that the next Bay Area Flink Meetup is about one week
away. Please RSVP if you plan to attend. Here is the list of the proposed
talks:

-- General community and project updates (Fabian Hueske)
-- Real-time experimentation and beyond with Flink at Pinterest (Parag
Kesar, Steven Bairos-Novak)
-- Flink now has a persistent metastore (Xuefu Zhang)

See you all next Monday!

Regards,

Xuefu Zhang

On Thu, Mar 7, 2019 at 5:08 PM Xuefu Zhang  wrote:

> Hi all,
>
> As an update, this meetup will take place at 505 Brannan St · San
> Francisco, CA
> <https://www.google.com/maps/search/?api=1=Pinterest%2C+505+Brannan+St%2C+San+Francisco%2C+CA%2C+94107%2C+us_place_id=ChIJb7GKw9V_j4ARC57mzICRckw>.
> Many thanks to Pinterest for their generosity in hosting the event.
>
> At the same time, I'd like to solicit your help on the following:
>
> 1. RSVP at the meetup page if you're attending.
> 2. Submit your talk if you'd like to share.
> 3. Spread the word via all means.
>
> As the meetup will occur right at the time when the open source
> communities gather around Strata Data Conference
> <https://conferences.oreilly.com/strata/strata-ca> and Flink Forward SF
> <https://sf-2019.flink-forward.org/register>, we hope this event will be a
> great opportunity to reach out and get involved.
>
> Your kind assistance is greatly appreciated!
>
> Regards,
> Xuefu
>
>
>
> On Wed, Mar 6, 2019 at 2:09 PM Xuefu Zhang  wrote:
>
>> Hi all,
>>
>> This is a kind reminder that our next Flink meetup is only a couple of
>> weeks away. This is an opportunity to share experience, gain insights,
>> or just socialize with the community.
>>
>> RSVP is required, which can be done at the meetup webpage
>> <https://www.meetup.com/Bay-Area-Apache-Flink-Meetup/events/258975465>.
>>
>> We are still finalizing the location. If you know of any company that could
>> host, please kindly let me know. Also, we can still accommodate a couple of
>> talks. Please also let me know if you'd like to present one.
>>
>> Thanks,
>> Xuefu
>>
>> On Thu, Feb 14, 2019 at 4:32 PM Xuefu Zhang  wrote:
>>
>>> Hi all,
>>>
>>> I'm very excited to announce that the community is planning the next
> >>> meetup in the Bay Area on March 25, 2019. The event was just announced on
>>> Meetup.com [1].
>>>
>>> To make the event successful, your participation and help will be
>>> needed. Currently, we are looking for an organization that can host the
>>> event. Please let me know if you have any leads.
>>>
>>> Secondly, we encourage Flink users and developers to take this as an
>>> opportunity to share experience or development. Thus, please let me know if
> >>> you'd like to give a short talk.
>>>
>>> I look forward to meeting you all in the Meetup.
>>>
>>> Regards,
>>> Xuefu
>>>
>>> [1] https://www.meetup.com/Bay-Area-Apache-Flink-Meetup/events/258975465
>>>
>>


Re: [ANNOUNCEMENT] March 2019 Bay Area Apache Flink Meetup

2019-03-06 Thread Xuefu Zhang
Hi all,

This is a kind reminder that our next Flink meetup is only a couple of
weeks away. This is an opportunity to share experience, gain insights,
or just socialize with the community.

RSVP is required, which can be done at the meetup webpage
<https://www.meetup.com/Bay-Area-Apache-Flink-Meetup/events/258975465>.

We are still finalizing the location. If you know of any company that could
host, please kindly let me know. Also, we can still accommodate a couple of
talks. Please also let me know if you'd like to present one.

Thanks,
Xuefu

On Thu, Feb 14, 2019 at 4:32 PM Xuefu Zhang  wrote:

> Hi all,
>
> I'm very excited to announce that the community is planning the next
> meetup in the Bay Area on March 25, 2019. The event was just announced on
> Meetup.com [1].
>
> To make the event successful, your participation and help will be needed.
> Currently, we are looking for an organization that can host the event.
> Please let me know if you have any leads.
>
> Secondly, we encourage Flink users and developers to take this as an
> opportunity to share experience or development. Thus, please let me know if
> you'd like to give a short talk.
>
> I look forward to meeting you all in the Meetup.
>
> Regards,
> Xuefu
>
> [1] https://www.meetup.com/Bay-Area-Apache-Flink-Meetup/events/258975465
>


[ANNOUNCEMENT] March 2019 Bay Area Apache Flink Meetup

2019-02-14 Thread Xuefu Zhang
Hi all,

I'm very excited to announce that the community is planning the next meetup
in the Bay Area on March 25, 2019. The event was just announced on Meetup.com
[1].

To make the event successful, your participation and help will be needed.
Currently, we are looking for an organization that can host the event.
Please let me know if you have any leads.

Secondly, we encourage Flink users and developers to take this as an
opportunity to share experience or development. Thus, please let me know if
you'd like to give a short talk.

I look forward to meeting you all in the Meetup.

Regards,
Xuefu

[1] https://www.meetup.com/Bay-Area-Apache-Flink-Meetup/events/258975465


[jira] [Created] (FLINK-11519) Add function related catalog APIs

2019-02-01 Thread Xuefu Zhang (JIRA)
Xuefu Zhang created FLINK-11519:
---

 Summary: Add function related catalog APIs
 Key: FLINK-11519
 URL: https://issues.apache.org/jira/browse/FLINK-11519
 Project: Flink
  Issue Type: Sub-task
  Components: Table API & SQL
Reporter: Xuefu Zhang
Assignee: Xuefu Zhang


This is to support functions (UDFs) in relation to the catalog.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-11518) Add partition related catalog APIs

2019-02-01 Thread Xuefu Zhang (JIRA)
Xuefu Zhang created FLINK-11518:
---

 Summary: Add partition related catalog APIs
 Key: FLINK-11518
 URL: https://issues.apache.org/jira/browse/FLINK-11518
 Project: Flink
  Issue Type: Sub-task
  Components: Table API & SQL
Reporter: Xuefu Zhang
Assignee: Xuefu Zhang


To support partitions, we need to introduce additional APIs on top of 
FLINK-11474.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-11482) Implement GenericHiveMetastoreCatalog

2019-01-30 Thread Xuefu Zhang (JIRA)
Xuefu Zhang created FLINK-11482:
---

 Summary: Implement GenericHiveMetastoreCatalog
 Key: FLINK-11482
 URL: https://issues.apache.org/jira/browse/FLINK-11482
 Project: Flink
  Issue Type: Sub-task
  Components: Table API & SQL
Reporter: Xuefu Zhang
Assignee: Bowen Li


{{GenericHiveMetastoreCatalog}} is a special implementation of 
{{ReadableWritableCatalog}} interface to store tables/views/functions defined 
in Flink to Hive metastore. With respect to the objects stored, 
{{GenericHiveMetastoreCatalog}} is similar to {{GenericInMemoryCatalog}}, but 
the storage used is a Hive metastore instead of memory.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-11481) Integrate HiveTableFactory with existing table factory discovery mechanism

2019-01-30 Thread Xuefu Zhang (JIRA)
Xuefu Zhang created FLINK-11481:
---

 Summary: Integrate HiveTableFactory with existing table factory 
discovery mechanism
 Key: FLINK-11481
 URL: https://issues.apache.org/jira/browse/FLINK-11481
 Project: Flink
  Issue Type: Sub-task
  Components: Table API & SQL
Reporter: Xuefu Zhang
Assignee: Xuefu Zhang


Currently, a table factory is auto-discovered based on table properties. 
However, {{HiveTableFactory}}, the factory class for Hive tables, is 
specifically returned from {{HiveCatalog}}. Since we allow both mechanisms, we 
need to integrate the two so that they work seamlessly.

Please refer to the design doc for details. Some further design thoughts are 
necessary, however.
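
One possible shape of the integration, sketched with simplified types purely 
for illustration (these are not the actual Flink classes or the final design): 
prefer a factory supplied by the catalog, and fall back to property-based 
discovery otherwise.

{code:java}
import java.util.Map;
import java.util.Optional;

interface TableFactory {}

interface Catalog {
    /** A catalog such as HiveCatalog may supply its own factory (e.g. HiveTableFactory). */
    Optional<TableFactory> getTableFactory();
}

class TableFactoryResolver {

    TableFactory resolve(Catalog catalog, Map<String, String> tableProperties) {
        // Use the catalog-provided factory if present; otherwise fall back to the
        // existing property-based discovery mechanism.
        return catalog.getTableFactory()
                .orElseGet(() -> discoverFromProperties(tableProperties));
    }

    private TableFactory discoverFromProperties(Map<String, String> properties) {
        // Stand-in for the existing service-loader based discovery.
        return new TableFactory() {};
    }
}
{code}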



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-11480) Create HiveTableFactory that creates TableSource/Sink from a Hive table

2019-01-30 Thread Xuefu Zhang (JIRA)
Xuefu Zhang created FLINK-11480:
---

 Summary: Create HiveTableFactory that creates TableSource/Sink 
from a Hive table
 Key: FLINK-11480
 URL: https://issues.apache.org/jira/browse/FLINK-11480
 Project: Flink
  Issue Type: Sub-task
  Components: Table API & SQL
Reporter: Xuefu Zhang
Assignee: Xuefu Zhang


This may require some design thought because {{HiveTableFactory}} is 
different from existing {{TableFactory}} implementations.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-11479) Implement HiveCatalog

2019-01-30 Thread Xuefu Zhang (JIRA)
Xuefu Zhang created FLINK-11479:
---

 Summary: Implement HiveCatalog
 Key: FLINK-11479
 URL: https://issues.apache.org/jira/browse/FLINK-11479
 Project: Flink
  Issue Type: Sub-task
  Components: Table API & SQL
Reporter: Xuefu Zhang
Assignee: Bowen Li


{{HiveCatalog}} is an implementation of {{ReadableWritableCatalog}} interface 
for meta objects managed by Hive Metastore.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-11478) Handle existing table registration via YAML file in the context of catalog support

2019-01-30 Thread Xuefu Zhang (JIRA)
Xuefu Zhang created FLINK-11478:
---

 Summary: Handle existing table registration via YAML file in the 
context of catalog support
 Key: FLINK-11478
 URL: https://issues.apache.org/jira/browse/FLINK-11478
 Project: Flink
  Issue Type: Sub-task
  Components: SQL Client, Table API & SQL
Reporter: Xuefu Zhang
Assignee: Xuefu Zhang


Before catalogs were introduced, it was common for users to define their tables 
in the SQL client YAML file. With catalog support, this is no longer necessary. 
However, to keep backward compatibility, this mechanism should continue 
functioning. Behind the scenes, we need to solve the problem that the same 
tables are registered every time the user launches the SQL client.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-11477) Define catalog entries in SQL client YAML file and handle the creation and registration of those entries

2019-01-30 Thread Xuefu Zhang (JIRA)
Xuefu Zhang created FLINK-11477:
---

 Summary: Define catalog entries in SQL client YAML file and handle 
the creation and registration of those entries
 Key: FLINK-11477
 URL: https://issues.apache.org/jira/browse/FLINK-11477
 Project: Flink
  Issue Type: Sub-task
  Components: SQL Client
Reporter: Xuefu Zhang
Assignee: Bowen Li


As the configuration for the SQL client, the YAML file currently allows one to 
register tables along with other entities such as deployment, execution, and so 
on. However, it doesn't have a section for catalog registration. We need to 
support users registering catalogs in the SQL client.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-11476) Create CatalogManager to manage multiple catalogs and encapsulate Calcite schema

2019-01-30 Thread Xuefu Zhang (JIRA)
Xuefu Zhang created FLINK-11476:
---

 Summary: Create CatalogManager to manage multiple catalogs and 
encapsulate Calcite schema
 Key: FLINK-11476
 URL: https://issues.apache.org/jira/browse/FLINK-11476
 Project: Flink
  Issue Type: Sub-task
  Components: Table API & SQL
Reporter: Xuefu Zhang
Assignee: Xuefu Zhang


Flink allows more than one registered catalog. The {{CatalogManager}} class is 
the holding class that manages and encapsulates the catalogs and their 
interrelations with Calcite.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-11475) Adapt existing InMemoryExternalCatalog to GenericInMemoryCatalog

2019-01-30 Thread Xuefu Zhang (JIRA)
Xuefu Zhang created FLINK-11475:
---

 Summary: Adapt existing InMemoryExternalCatalog to 
GenericInMemoryCatalog
 Key: FLINK-11475
 URL: https://issues.apache.org/jira/browse/FLINK-11475
 Project: Flink
  Issue Type: Sub-task
  Components: Table API & SQL
Reporter: Xuefu Zhang
Assignee: Bowen Li


{{GenericInMemoryCatalog}} needs to implement the {{ReadableWritableCatalog}} 
interface based on the design.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-11474) Add ReadableCatalog, ReadableWritableCatalog, and other related interfaces

2019-01-30 Thread Xuefu Zhang (JIRA)
Xuefu Zhang created FLINK-11474:
---

 Summary: Add ReadableCatalog, ReadableWritableCatalog, and other 
related interfaces
 Key: FLINK-11474
 URL: https://issues.apache.org/jira/browse/FLINK-11474
 Project: Flink
  Issue Type: Sub-task
  Components: Table API & SQL
Reporter: Xuefu Zhang
Assignee: Xuefu Zhang


Also deprecate {{ExternalCatalog}} and other related, existing classes that 
these new interfaces replace.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-11275) Unified Catalog APIs

2019-01-07 Thread Xuefu Zhang (JIRA)
Xuefu Zhang created FLINK-11275:
---

 Summary: Unified Catalog APIs
 Key: FLINK-11275
 URL: https://issues.apache.org/jira/browse/FLINK-11275
 Project: Flink
  Issue Type: Improvement
  Components: Table API & SQL
Reporter: Xuefu Zhang
Assignee: Xuefu Zhang


During the Flink-Hive integration, we found that the current Catalog APIs are 
quite cumbersome and to a large extent require significant rework. While the 
previous APIs are essentially unused, at least in the Flink code base, we need 
to be careful in defining a new set of APIs to avoid yet another rework in the 
future.

This is an uber JIRA covering all the work outlined in FLIP-30.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-10760) Create a command line tool to migrate meta objects specified in SQL client configuration

2018-11-01 Thread Xuefu Zhang (JIRA)
Xuefu Zhang created FLINK-10760:
---

 Summary: Create a command line tool to migrate meta objects 
specified in SQL client configuration
 Key: FLINK-10760
 URL: https://issues.apache.org/jira/browse/FLINK-10760
 Project: Flink
  Issue Type: Sub-task
  Components: SQL Client
Reporter: Xuefu Zhang
Assignee: Xuefu Zhang


With a persistent catalog for Flink meta objects (tables, views, functions, 
etc.), it becomes unnecessary to specify such objects in the SQL client 
configuration (YAML) file. However, it would be helpful for users who already 
have some meta objects specified in the YAML file to have a command line tool 
that migrates the objects specified in YAML files to the persistent catalog 
once and for all.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-10759) Adapt SQL-client configuration file to specify external catalogs and default catalog

2018-11-01 Thread Xuefu Zhang (JIRA)
Xuefu Zhang created FLINK-10759:
---

 Summary: Adapt SQL-client configuration file to specify external 
catalogs and default catalog
 Key: FLINK-10759
 URL: https://issues.apache.org/jira/browse/FLINK-10759
 Project: Flink
  Issue Type: Sub-task
  Components: SQL Client
Reporter: Xuefu Zhang
Assignee: Xuefu Zhang


The configuration (YAML) file doesn't currently seem to allow specifying 
external catalogs. The request here is to add support for external catalog 
specifications in the YAML file. Users should also be able to specify one 
catalog as the default.

The catalog-related configuration then needs to be processed and passed to 
{{TableEnvironment}} accordingly by calling the relevant APIs.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-10758) Refactor TableEnvironment so that all registration calls delegate to CatalogManager

2018-11-01 Thread Xuefu Zhang (JIRA)
Xuefu Zhang created FLINK-10758:
---

 Summary: Refactor TableEnvironment so that all registration calls 
delegate to CatalogManager 
 Key: FLINK-10758
 URL: https://issues.apache.org/jira/browse/FLINK-10758
 Project: Flink
  Issue Type: Sub-task
  Components: Table API & SQL
Reporter: Xuefu Zhang
Assignee: Xuefu Zhang


There are many different APIs defined in {{TableEnvironment}} class that 
register tables/views/functions. Based on the design doc, those calls need to 
be delegated to {{CatalogManager}}. However, not all delegations are 
straightforward. For example, table registration could mean registering 
permanent tables, temp tables, or views. This JIRA takes care of the details. 

Please refer to the "TableEnvironment Class" section in the design doc 
(attached to the parent task) for more details.
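
A rough sketch of the delegation, using simplified placeholder types (not the 
actual Flink signatures):

{code:java}
// Illustrative only: registration calls on TableEnvironment are forwarded to
// CatalogManager, which decides where the object lands (permanent table,
// temporary table, or view).
class CatalogManager {
    void registerTable(String name, Object table, boolean temporary) {
        // resolve the target catalog/database and store the object there
    }
}

class TableEnvironment {
    private final CatalogManager catalogManager;

    TableEnvironment(CatalogManager catalogManager) {
        this.catalogManager = catalogManager;
    }

    void registerTable(String name, Object table) {
        // previously handled directly in TableEnvironment; now delegated
        catalogManager.registerTable(name, table, /* temporary = */ true);
    }
}
{code}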



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-10744) Integrate Flink with Hive metastore

2018-10-31 Thread Xuefu Zhang (JIRA)
Xuefu Zhang created FLINK-10744:
---

 Summary: Integrate Flink with Hive metastore 
 Key: FLINK-10744
 URL: https://issues.apache.org/jira/browse/FLINK-10744
 Project: Flink
  Issue Type: Improvement
  Components: Table API & SQL
Affects Versions: 1.6.2
Reporter: Xuefu Zhang
Assignee: Xuefu Zhang


This JIRA keeps track of the effort of FLINK-10556 on Hive metastore 
integration. It mainly covers two aspects:
# Register Hive metastore as an external catalog of Flink, such that Hive table 
metadata can be accessed directly.
# Store Flink metadata (tables, views, UDFs, etc) in a catalog that utilizes 
Hive as the schema registry.
Discussions and resulting design doc will be shared here, but detailed work 
items will be tracked by sub-tasks.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-10729) Create a Hive connector to access Hive data

2018-10-30 Thread Xuefu Zhang (JIRA)
Xuefu Zhang created FLINK-10729:
---

 Summary: Create a Hive connector to access Hive data
 Key: FLINK-10729
 URL: https://issues.apache.org/jira/browse/FLINK-10729
 Project: Flink
  Issue Type: Sub-task
  Components: Table API & SQL
Affects Versions: 1.6.2
Reporter: Xuefu Zhang


As part of the Flink-Hive integration effort, it's important for Flink to access 
(read/write) Hive data, which is the responsibility of the Hive connector. While 
there is an HCatalog data connector in the code base, it's not complete (i.e., 
missing all connector-related classes such as validators, etc.). Further, the 
HCatalog interface has many limitations, such as accessing only a subset of Hive 
data and supporting only a subset of Hive data types. In addition, it's not 
actively maintained; in fact, it's now only a sub-project in Hive.

Therefore, we propose a complete connector set for Hive tables, not via 
HCatalog, but via the direct Hive interface. The HCatalog connector will be 
deprecated.

Please note that the connector for Hive metadata is already covered in other 
JIRAs, as {{HiveExternalCatalog}}.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-10699) Create a catalog implementation for persistent Flink meta objects using Hive metastore as a registry

2018-10-26 Thread Xuefu Zhang (JIRA)
Xuefu Zhang created FLINK-10699:
---

 Summary: Create a catalog implementation for persistent Flink meta 
objects using Hive metastore as a registry
 Key: FLINK-10699
 URL: https://issues.apache.org/jira/browse/FLINK-10699
 Project: Flink
  Issue Type: Sub-task
  Components: Table API & SQL
Affects Versions: 1.6.1
Reporter: Xuefu Zhang


Similar to FLINK-10697, but using Hive metastore as persistent storage.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-10698) Create CatalogManager class that manages all external catalogs and temporary meta objects

2018-10-26 Thread Xuefu Zhang (JIRA)
Xuefu Zhang created FLINK-10698:
---

 Summary: Create CatalogManager class that manages all external 
catalogs and temporary meta objects
 Key: FLINK-10698
 URL: https://issues.apache.org/jira/browse/FLINK-10698
 Project: Flink
  Issue Type: Sub-task
  Components: Table API & SQL
Affects Versions: 1.6.1
Reporter: Xuefu Zhang


Currently {{TableEnvironment}} manages a list of registered external catalogs 
as well as in-memory meta objects, and interacts with the Calcite schema. It 
would be cleaner to delegate all those responsibilities to a dedicated class, 
especially when Flink's meta objects are also stored in a catalog.

{{CatalogManager}} is responsible for managing all meta objects, including 
external catalogs, temporary meta objects, and the Calcite schema.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-10697) Create an in-memory catalog that stores Flink's meta objects

2018-10-26 Thread Xuefu Zhang (JIRA)
Xuefu Zhang created FLINK-10697:
---

 Summary: Create an in-memory catalog that stores Flink's meta 
objects
 Key: FLINK-10697
 URL: https://issues.apache.org/jira/browse/FLINK-10697
 Project: Flink
  Issue Type: Sub-task
  Components: Table API & SQL
Affects Versions: 1.6.1
Reporter: Xuefu Zhang


Currently all Flink meta objects (currently tables only) are stored in memory 
as part of the Calcite catalog. Some of those objects are temporary (such as 
inline tables), while others are meant to live beyond the user session. As we 
introduce a catalog for those objects (tables, views, and UDFs), it makes sense 
to organize them neatly. Further, having a catalog implementation that stores 
those objects in memory retains the current behavior, which can be configured 
by the user.

Please note that this implementation is different from the current 
{{InMemoryExternalCatalog}}, which is used mainly for testing and doesn't 
reflect what's actually needed for Flink meta objects.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-10696) Add APIs to ExternalCatalog for views and UDFs

2018-10-26 Thread Xuefu Zhang (JIRA)
Xuefu Zhang created FLINK-10696:
---

 Summary: Add APIs to ExternalCatalog for views and UDFs
 Key: FLINK-10696
 URL: https://issues.apache.org/jira/browse/FLINK-10696
 Project: Flink
  Issue Type: Sub-task
  Components: Table API & SQL
Affects Versions: 1.6.1
Reporter: Xuefu Zhang


Currently there are APIs for tables only. However, views and UDFs are also 
common objects in a catalog.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-10618) Introduce catalog for Flink tables

2018-10-19 Thread Xuefu Zhang (JIRA)
Xuefu Zhang created FLINK-10618:
---

 Summary: Introduce catalog for Flink tables
 Key: FLINK-10618
 URL: https://issues.apache.org/jira/browse/FLINK-10618
 Project: Flink
  Issue Type: New Feature
  Components: SQL Client
Affects Versions: 1.6.1
Reporter: Xuefu Zhang
Assignee: Xuefu Zhang


Besides meta objects such as tables that may come from an {{ExternalCatalog}}, 
Flink also deals with tables/views/functions that are created on the fly (in 
memory) or specified in a configuration file. Those objects don't belong to 
any {{ExternalCatalog}}, yet Flink either stores them in memory, which is 
non-persistent, or recreates them from a file, which is a big pain for the 
user. Those objects are only known to Flink, but Flink manages them poorly.

Since they are typical objects in a database catalog, it's natural to have a 
catalog that manages those objects. The interface will be similar to 
{{ExternalCatalog}}, which contains meta objects that are not managed by Flink. 
There are several possible implementations of the Flink internal catalog 
interface: memory, file, external registry (such as Confluent Schema Registry 
or Hive metastore), relational database, etc.

The initial functionality as well as the catalog hierarchy could be very 
simple. The basic functionality of the catalog will be mostly creating, 
altering, and dropping tables, views, functions, etc. Obviously, this can 
evolve over time.

We plan to provide implementations backed by memory, file, and the Hive 
metastore, to be plugged in at the SQL Client layer.
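
To make the intended basic functionality concrete, a hedged sketch follows 
(names and shapes are illustrative assumptions, not the eventual interface):

{code:java}
// Illustrative only: the catalog's initial surface is mostly create/alter/drop
// for tables, views, and functions, mirroring what a database catalog offers.
interface FlinkCatalog {

    void createTable(String name, Object table, boolean ignoreIfExists);
    void alterTable(String name, Object newTable, boolean ignoreIfNotExists);
    void dropTable(String name, boolean ignoreIfNotExists);

    void createView(String name, Object view, boolean ignoreIfExists);
    void dropView(String name, boolean ignoreIfNotExists);

    void createFunction(String name, Object function, boolean ignoreIfExists);
    void dropFunction(String name, boolean ignoreIfNotExists);
}
{code}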

Please provide your feedback.





--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-10556) Integration with Apache Hive

2018-10-15 Thread Xuefu Zhang (JIRA)
Xuefu Zhang created FLINK-10556:
---

 Summary: Integration with Apache Hive
 Key: FLINK-10556
 URL: https://issues.apache.org/jira/browse/FLINK-10556
 Project: Flink
  Issue Type: New Feature
  Components: Batch Connectors and Input/Output Formats, SQL Client, 
Table API & SQL
Affects Versions: 1.6.0
Reporter: Xuefu Zhang
Assignee: Xuefu Zhang


This is an umbrella JIRA tracking all enhancements and issues related to 
integrating Flink with the Hive ecosystem. This is an outcome of a discussion 
in the community, and thanks go to everyone who provided feedback and interest.

Specifically, we'd like to see the following features and capabilities 
immediately in Flink:
# Metadata interoperability
# Data interoperability
# Data type compatibility
# Hive UDF support
# DDL/DML/Query language compatibility

For a longer term, we'd also like to add or improve:
# Compatible SQL service, client tools, JDBC/ODBC drivers
# Better task failure tolerance and task scheduling
# Support other user customizations in Hive (storage handlers, serdes, etc).

I will provide more details regarding the proposal in a doc shortly. A design 
doc, if deemed necessary, will be provided in each related sub-task under this 
JIRA.

Feedback and contributions are greatly welcome!



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-10542) Register Hive metastore as an external catalog in TableEnvironment

2018-10-12 Thread Xuefu Zhang (JIRA)
Xuefu Zhang created FLINK-10542:
---

 Summary: Register Hive metastore as an external catalog in 
TableEnvironment
 Key: FLINK-10542
 URL: https://issues.apache.org/jira/browse/FLINK-10542
 Project: Flink
  Issue Type: New Feature
  Components: Table API & SQL
Affects Versions: 1.6.0
Reporter: Xuefu Zhang
Assignee: Xuefu Zhang


Similar to FLINK-2167, but instead registers the Hive metastore as an external 
catalog in the {{TableEnvironment}}. After registration, Table API and SQL 
queries should be able to access all Hive tables.

This might supersede the need for FLINK-2167, because the Hive metastore stores 
a superset of the tables available via HCatalog, without the indirection.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)