Re: [DISCUSS] Iceberg roadmap

Reo Lei Sun, 07 Nov 2021 21:24:06 -0800

+1, I have the same concern about the incompatible license.

On Mon, Nov 8, 2021 at 11:48 AM Jacques Nadeau <jacquesnad...@gmail.com> wrote:


> A few additional observations about StarRocks...
>
> - As far as I can tell, StarRocks has an ASF-incompatible license (Elastic
> License 2.0).
> - It appears to be a hard fork of Apache Doris, a project still in the
> incubator (and it looks like it is probably destructive to the Doris project).
> - The project has only existed for ~2 months.
>
> On Sun, Nov 7, 2021 at 7:34 PM OpenInx <open...@gmail.com> wrote:
>
>> Any thoughts on adding StarRocks integration to the roadmap?
>>
>> I think the folks from the StarRocks community can provide more background
>> and input.
>>
>> On Thu, Nov 4, 2021 at 5:59 PM OpenInx <open...@gmail.com> wrote:
>>
>>> Update:
>>>
>>> StarRocks [1] is a next-gen, sub-second MPP database for full analysis
>>> scenarios, including multi-dimensional analytics, real-time analytics, and
>>> ad-hoc queries. Their team is planning to integrate Iceberg tables as
>>> StarRocks external tables in the next month [2], so that people can
>>> connect the data lake and the StarRocks warehouse in the same engine.
>>> StarRocks' performance should also help accelerate analysis of and access
>>> to Iceberg tables, which I think is a great thing for both the Iceberg
>>> community and the StarRocks community. Could we add a project for the
>>> StarRocks integration work to the Apache Iceberg roadmap [3]?
>>>
>>> [1].  https://github.com/StarRocks/starrocks
>>> [2].  https://github.com/StarRocks/starrocks/issues/1030
>>> [3].  https://github.com/apache/iceberg/projects
>>>
>>> On Mon, Nov 1, 2021 at 11:52 PM Ryan Blue <b...@tabular.io> wrote:
>>>
>>>> I closed the upgrade project and marked the FLIP-27 project priority 1.
>>>> Thanks for all the work to get this done!
>>>>
>>>> On Sun, Oct 31, 2021 at 8:10 PM OpenInx <open...@gmail.com> wrote:
>>>>
>>>>> Update:
>>>>>
>>>>> I think the project [Flink: Upgrade to 1.13.2][1] in the roadmap can be
>>>>> closed now, because all of the issues have been addressed.
>>>>>
>>>>> [1]. https://github.com/apache/iceberg/projects/12
>>>>>
>>>>> On Tue, Sep 21, 2021 at 6:17 PM Eduard Tudenhoefner <edu...@dremio.com>
>>>>> wrote:
>>>>>
>>>>>> I created a Roadmap section in
>>>>>> https://github.com/apache/iceberg/pull/3163 that links to the
>>>>>> planning boards that Jack created. I figured it makes sense to link
>>>>>> available design docs directly on those boards (as was already done),
>>>>>> because then the design docs are closer to the set of related issues.
>>>>>>
>>>>>> On Mon, Sep 20, 2021 at 10:02 PM Ryan Blue <b...@tabular.io> wrote:
>>>>>>
>>>>>>> Thanks, Jack!
>>>>>>>
>>>>>>> Eduard, I think that's a good idea. We should have a roadmap page as
>>>>>>> well that links to the projects that Jack just created.
>>>>>>>
>>>>>>> On Mon, Sep 20, 2021 at 12:57 PM Jack Ye <yezhao...@gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> It seems like we have reached some consensus around the projects
>>>>>>>> listed here. I have created corresponding GitHub projects for each:
>>>>>>>> https://github.com/apache/iceberg/projects
>>>>>>>>
>>>>>>>> Related design docs are also linked there.
>>>>>>>>
>>>>>>>> Best,
>>>>>>>> Jack Ye
>>>>>>>>
>>>>>>>> On Sun, Sep 19, 2021 at 11:18 PM Eduard Tudenhoefner <
>>>>>>>> edu...@dremio.com> wrote:
>>>>>>>>
>>>>>>>>> Would it make sense to have a section on the website where we
>>>>>>>>> collect all the links to the design docs/specs, as that would be
>>>>>>>>> easier to find than searching for things on the ML?
>>>>>>>>>
>>>>>>>>> I was thinking about something like this for each component:
>>>>>>>>> * link to the ML discussion
>>>>>>>>> * link to the actual Spec/Design Doc
>>>>>>>>>
>>>>>>>>> Thoughts?
>>>>>>>>>
>>>>>>>>> On Fri, Sep 10, 2021 at 11:38 PM Ryan Blue <b...@tabular.io>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Hi everyone,
>>>>>>>>>>
>>>>>>>>>> At the last sync meeting, we brought up publishing a community
>>>>>>>>>> roadmap and brainstormed the many features and initiatives that the
>>>>>>>>>> community is working on. In this thread, I want to make sure that we
>>>>>>>>>> have a good list of what people are thinking about, and I think we
>>>>>>>>>> should try to categorize the projects by size and general priority.
>>>>>>>>>> When we reach a rough agreement, I’ll write this up and post it on
>>>>>>>>>> the ASF site along with links to some projects in GitHub.
>>>>>>>>>>
>>>>>>>>>> My rationale for attempting to prioritize projects is that if we
>>>>>>>>>> try to do too many things, it will be slower progress across
>>>>>>>>>> everything rather than getting a few important items done. I know
>>>>>>>>>> that priorities don’t align very cleanly in practice, but it is
>>>>>>>>>> hopefully worth trying. To come up with a priority, I’m trying to
>>>>>>>>>> keep top priority items to a minimum by including only one from each
>>>>>>>>>> group (Spark, Flink, Python, etc.). The remaining items are split
>>>>>>>>>> between priority 2 and 3. Priority 3 is not urgent, including things
>>>>>>>>>> that can be plugged in (like other IO libraries), docs, etc.
>>>>>>>>>> Everything else is priority 2.
>>>>>>>>>>
>>>>>>>>>> That something isn’t priority 1 doesn’t mean it isn’t important
>>>>>>>>>> or progressing, just that it isn’t the current focus. I think of it
>>>>>>>>>> this way: if someone has extra time to review something, what should
>>>>>>>>>> be next? That’s top priority.
>>>>>>>>>>
>>>>>>>>>> Here’s my rough categorization. If you disagree, please speak up:
>>>>>>>>>>
>>>>>>>>>>    - If you think that something should be top priority, what
>>>>>>>>>>    gets moved to priority 2?
>>>>>>>>>>    - Should the priority for a project in 2 or 3 change?
>>>>>>>>>>    - Is the S/M/L size of a project wrong?
>>>>>>>>>>
>>>>>>>>>> Top priority, 1:
>>>>>>>>>>
>>>>>>>>>>    - API: Iceberg 1.0 [medium]
>>>>>>>>>>    - Spark: Merge-on-read plans [large]
>>>>>>>>>>    - Maintenance: Delete file compaction [medium]
>>>>>>>>>>    - Flink: Upgrade to 1.13.2 (document compatibility) [medium]
>>>>>>>>>>    - Python: Pythonic refactor [medium]
>>>>>>>>>>
>>>>>>>>>> Priority 2:
>>>>>>>>>>
>>>>>>>>>>    - ORC: Support delete files stored as ORC [small]
>>>>>>>>>>    - Spark: DSv2 streaming improvements [small]
>>>>>>>>>>    - Flink: Inline file compaction [small]
>>>>>>>>>>    - Flink: Support UPSERT [small]
>>>>>>>>>>    - Views: Spec [medium]
>>>>>>>>>>    - Spec: Z-ordering / Space-filling curves [medium]
>>>>>>>>>>    - Spec: Snapshot tagging and branching [small]
>>>>>>>>>>    - Spec: Secondary indexes [large]
>>>>>>>>>>    - Spec v3: Encryption [large]
>>>>>>>>>>    - Spec v3: Relative paths [large]
>>>>>>>>>>    - Spec v3: Default field values [medium]
>>>>>>>>>>
>>>>>>>>>> Priority 3:
>>>>>>>>>>
>>>>>>>>>>    - Docs: versioned docs [medium]
>>>>>>>>>>    - IO: Support Aliyun OSS/DLF [medium]
>>>>>>>>>>    - IO: Support Dell ECS [medium]
>>>>>>>>>>
>>>>>>>>>> External:
>>>>>>>>>>
>>>>>>>>>>    - Trino: Bucketed joins [small]
>>>>>>>>>>    - Trino: Row-level delete support [medium]
>>>>>>>>>>    - Trino: Merge-on-read plans [medium]
>>>>>>>>>>    - Trino: Multi-catalog support [small]
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> Ryan Blue
>>>>>>>>>> Tabular
>>>>>>>>>>
>>>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Ryan Blue
>>>>>>> Tabular
>>>>>>>
>>>>>>
>>>>
>>>> --
>>>> Ryan Blue
>>>> Tabular
>>>>
>>>
