" It would be simply to gain full functionality of Hive" . That should read
Iceberg.

On Wed, Jan 29, 2020 at 12:55 PM Kristopher Kane <kk...@etsy.com> wrote:

> Adrian, "I'd imagine that keeping binary compatibility across Hive, Spark
> and Iceberg will be quite a challenge."  Yeah, this is what I'm afraid of
> over time.  Iceberg's big draw for me is only maintaining a processing
> engine (Spark), Iceberg and cloud storage compatibility; any potential
> Iceberg use wouldn't even be with the rest of the Hive ecosystem. It would
> be simply to gain full functionality of Hive via a ready-to-use metastore
> which, right now, defaults to Hive.  Hive 3, with Ranger, Atlas and
> Ranger-based security, takes things even further away for Spark, as it does
> not allow interaction with Hive-intrinsic services like the metastore
> anyway.  It might be that you can run the Hive 3 metastore for now, but the
> paths forward don't suggest that will remain accessible far into the future.
>
> Ryan, when you said, "I'd really love to see a new metastore project," did
> you mean internal to the Iceberg project?
>
> Kris
>
> On Wed, Jan 29, 2020 at 12:17 PM Mass Dosage <massdos...@gmail.com> wrote:
>
>> On the topic of Hive versions - we've definitely experienced some issues
>> trying to programmatically use the iceberg-spark-runtime artifact in unit
>> tests (it uses Hive 1.2 as mentioned above). We then tried to also use some
>> other common Hive testing libraries like HiveRunner
>> <https://github.com/klarna/HiveRunner/> and BeeJU
>> <https://github.com/HotelsDotCom/beeju> which in turn use Hive 2.3. We
>> then ended up with exceptions (e.g. "Method not found") due to
>> incompatibilities between the Hive library classes and had to abandon the
>> testing libraries. I can share these exceptions if that would be useful but
>> I'd imagine that keeping binary compatibility across Hive, Spark and
>> Iceberg will be quite a challenge. I'd prefer Iceberg defaulting to Hive
>> 2.3.x over 1.2, as 1.2 is pretty old. I don't think any of the commercial
>> Hadoop vendors officially support it any more, and I think it's used a lot
>> less now than 2.x, but I could be wrong. Alternatively, a way to pick and
>> choose a Hive version would be great, but probably quite a bit of work to
>> pull off...
>>
>> Adrian
>>
>> On Wed, 29 Jan 2020 at 16:59, Ryan Blue <rb...@netflix.com.invalid>
>> wrote:
>>
>>> Hi Kris,
>>>
>>> We use version 1.2.1 because the part that we're using hasn't changed
>>> much and we want to ensure compatibility with old metastore versions.
>>> Iceberg should work with newer metastores, and feel free to open a bug if
>>> you find a problem with one. We'll make sure to fix it to be compatible
>>> with a range of versions.
>>>
>>> I'm not sure what people are going to want eventually. Right now, we
>>> know that many people use the Hive metastore to track tables, so it makes
>>> sense to support it as an option. Iceberg allows you to plug in your own
>>> metastore easily because we know that lots of places (Netflix included)
>>> have their own metastore implementations. I'd really love to see a new
>>> metastore project, but I don't think that Iceberg should be opinionated
>>> about which one you use.
>>>
>>> rb
>>>
>>> On Wed, Jan 29, 2020 at 7:32 AM Kristopher Kane <kk...@etsy.com.invalid>
>>> wrote:
>>>
>>>> Hi Iceberg.
>>>>
>>>> It looks like for most cases where non-atomic rename is required, using
>>>> the Hive metastore is the baseline, with the ability to implement a
>>>> custom one.
>>>>
>>>> I couldn't find any mailing list history or GitHub issue suggesting that
>>>> Iceberg will implement its own. Is that intended for the future?
>>>>
>>>> I ask because Iceberg's metastore version pin is 1.2.1, which is very
>>>> old.  Someone using Iceberg with a Hive metastore might find it difficult
>>>> to keep pace with Hive upgrades.
>>>>
>>>> Related:  Is the intention here that existing Hive users would use the
>>>> store that they have and new Iceberg users would implement a custom one?
>>>>
>>>> Appreciate help in understanding,
>>>>
>>>> Kris
>>>>
>>>
>>>
>>> --
>>> Ryan Blue
>>> Software Engineer
>>> Netflix
>>>
>>
