A few additional observations about StarRocks...

- As far as I can tell, StarRocks has an ASF-incompatible license (Elastic
License 2.0).
- It appears to be a hard fork of Apache Doris, a project still in the
incubator (and it looks like the fork is probably damaging to the Doris
project).
- The project has existed for only ~2 months.

On Sun, Nov 7, 2021 at 7:34 PM OpenInx <open...@gmail.com> wrote:

> Any thoughts on adding StarRocks integration to the roadmap?
>
> I think people from the StarRocks community can provide more background
> and input.
>
> On Thu, Nov 4, 2021 at 5:59 PM OpenInx <open...@gmail.com> wrote:
>
>> Update:
>>
>> StarRocks [1] is a next-gen sub-second MPP database for full analysis
>> scenarios, including multi-dimensional analytics, real-time analytics and
>> ad-hoc queries. Their team is planning to integrate Iceberg tables as
>> StarRocks external tables in the next month [2], so that people could
>> connect the data lake and the StarRocks warehouse in the same engine.
>> The performance of StarRocks should also help accelerate analysis of and
>> access to Iceberg tables. I think this is a great thing for both the
>> Iceberg community and the StarRocks community. Could we add an extra
>> project for the StarRocks integration work to the Apache Iceberg
>> roadmap [3]?
>>
>> [1].  https://github.com/StarRocks/starrocks
>> [2].  https://github.com/StarRocks/starrocks/issues/1030
>> [3].  https://github.com/apache/iceberg/projects
>>
>> On Mon, Nov 1, 2021 at 11:52 PM Ryan Blue <b...@tabular.io> wrote:
>>
>>> I closed the upgrade project and marked the FLIP-27 project priority 1.
>>> Thanks for all the work to get this done!
>>>
>>> On Sun, Oct 31, 2021 at 8:10 PM OpenInx <open...@gmail.com> wrote:
>>>
>>>> Update:
>>>>
>>>> I think the project [Flink: Upgrade to 1.13.2][1] in the roadmap can be
>>>> closed now, because all of its issues have been addressed.
>>>>
>>>> [1]. https://github.com/apache/iceberg/projects/12
>>>>
>>>> On Tue, Sep 21, 2021 at 6:17 PM Eduard Tudenhoefner <edu...@dremio.com>
>>>> wrote:
>>>>
>>>>> I created a Roadmap section in
>>>>> https://github.com/apache/iceberg/pull/3163 that links to the
>>>>> planning boards that Jack created. I figured it makes sense if we link
>>>>> available design docs directly on those boards (as was already done),
>>>>> because then the design docs are closer to the set of related issues.
>>>>>
>>>>> On Mon, Sep 20, 2021 at 10:02 PM Ryan Blue <b...@tabular.io> wrote:
>>>>>
>>>>>> Thanks, Jack!
>>>>>>
>>>>>> Eduard, I think that's a good idea. We should have a roadmap page as
>>>>>> well that links to the projects that Jack just created.
>>>>>>
>>>>>> On Mon, Sep 20, 2021 at 12:57 PM Jack Ye <yezhao...@gmail.com> wrote:
>>>>>>
>>>>>>> It seems like we have reached some consensus around the projects
>>>>>>> listed here. I have created corresponding GitHub projects for each:
>>>>>>> https://github.com/apache/iceberg/projects
>>>>>>>
>>>>>>> Related design docs are also linked there.
>>>>>>>
>>>>>>> Best,
>>>>>>> Jack Ye
>>>>>>>
>>>>>>> On Sun, Sep 19, 2021 at 11:18 PM Eduard Tudenhoefner <
>>>>>>> edu...@dremio.com> wrote:
>>>>>>>
>>>>>>>> Would it make sense to have a section on the website where we
>>>>>>>> collect all the links to the design docs/specs as that would be
>>>>>>>> easier to find than searching for things on the ML?
>>>>>>>>
>>>>>>>> I was thinking about something like for each component:
>>>>>>>> * link to the ML discussion
>>>>>>>> * link to the actual Spec/Design Doc
>>>>>>>>
>>>>>>>> Thoughts?
>>>>>>>>
>>>>>>>> On Fri, Sep 10, 2021 at 11:38 PM Ryan Blue <b...@tabular.io> wrote:
>>>>>>>>
>>>>>>>>> Hi everyone,
>>>>>>>>>
>>>>>>>>> At the last sync meeting, we brought up publishing a community
>>>>>>>>> roadmap and brainstormed the many features and initiatives that the
>>>>>>>>> community is working on. In this thread, I want to make sure that we
>>>>>>>>> have a good list of what people are thinking about and I think we
>>>>>>>>> should try to categorize the projects by size and general priority.
>>>>>>>>> When we reach a rough agreement, I’ll write this up and post it on
>>>>>>>>> the ASF site along with links to some projects in GitHub.
>>>>>>>>>
>>>>>>>>> My rationale for attempting to prioritize projects is that if we
>>>>>>>>> try to do too many things, it will be slower progress across
>>>>>>>>> everything rather than getting a few important items done. I know
>>>>>>>>> that priorities don’t align very cleanly in practice, but it is
>>>>>>>>> hopefully worth trying. To come up with a priority, I’m trying to
>>>>>>>>> keep top priority items to a minimum by including only one from each
>>>>>>>>> group (Spark, Flink, Python, etc.). The remaining items are split
>>>>>>>>> between priority 2 and 3. Priority 3 is not urgent, including things
>>>>>>>>> that can be plugged in (like other IO libraries), docs, etc.
>>>>>>>>> Everything else is priority 2.
>>>>>>>>>
>>>>>>>>> That something isn’t priority 1 doesn’t mean it isn’t important or
>>>>>>>>> progressing, just that it isn’t the current focus. I think of it
>>>>>>>>> this way: if someone has extra time to review something, what should
>>>>>>>>> be next? That’s top priority.
>>>>>>>>>
>>>>>>>>> Here’s my rough categorization. If you disagree, please speak up:
>>>>>>>>>
>>>>>>>>>    - If you think that something should be top priority, what
>>>>>>>>>    gets moved to priority 2?
>>>>>>>>>    - Should the priority for a project in 2 or 3 change?
>>>>>>>>>    - Is the S/M/L size of a project wrong?
>>>>>>>>>
>>>>>>>>> Top priority, 1:
>>>>>>>>>
>>>>>>>>>    - API: Iceberg 1.0 [medium]
>>>>>>>>>    - Spark: Merge-on-read plans [large]
>>>>>>>>>    - Maintenance: Delete file compaction [medium]
>>>>>>>>>    - Flink: Upgrade to 1.13.2 (document compatibility) [medium]
>>>>>>>>>    - Python: Pythonic refactor [medium]
>>>>>>>>>
>>>>>>>>> Priority 2:
>>>>>>>>>
>>>>>>>>>    - ORC: Support delete files stored as ORC [small]
>>>>>>>>>    - Spark: DSv2 streaming improvements [small]
>>>>>>>>>    - Flink: Inline file compaction [small]
>>>>>>>>>    - Flink: Support UPSERT [small]
>>>>>>>>>    - Views: Spec [medium]
>>>>>>>>>    - Spec: Z-ordering / Space-filling curves [medium]
>>>>>>>>>    - Spec: Snapshot tagging and branching [small]
>>>>>>>>>    - Spec: Secondary indexes [large]
>>>>>>>>>    - Spec v3: Encryption [large]
>>>>>>>>>    - Spec v3: Relative paths [large]
>>>>>>>>>    - Spec v3: Default field values [medium]
>>>>>>>>>
>>>>>>>>> Priority 3:
>>>>>>>>>
>>>>>>>>>    - Docs: versioned docs [medium]
>>>>>>>>>    - IO: Support Aliyun OSS/DLF [medium]
>>>>>>>>>    - IO: Support Dell ECS [medium]
>>>>>>>>>
>>>>>>>>> External:
>>>>>>>>>
>>>>>>>>>    - Trino: Bucketed joins [small]
>>>>>>>>>    - Trino: Row-level delete support [medium]
>>>>>>>>>    - Trino: Merge-on-read plans [medium]
>>>>>>>>>    - Trino: Multi-catalog support [small]
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Ryan Blue
>>>>>>>>> Tabular
>>>>>>>>>
>>>>>>>>
>>>>>>
>>>>>> --
>>>>>> Ryan Blue
>>>>>> Tabular
>>>>>>
>>>>>
>>>
>>> --
>>> Ryan Blue
>>> Tabular
>>>
>>
