Hey Iceberg Community,

Sorry for being late to this conversation. I just wanted to share that I'm against deprecating HadoopCatalog or moving it to tests. Currently Impala relies heavily on HadoopCatalog for its own tests, and I personally find HadoopCatalog pretty handy when I just want to do some cross-engine experiments where my data is already on HDFS: I write a table with engineA, check whether engineB can read it, and I don't have to bother with setting up any service (HMS, for instance) to act as an Iceberg catalog.
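
To illustrate the kind of experiment I mean, the whole setup is roughly the snippet below (a sketch; the warehouse path and table names are placeholders):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.iceberg.Table;
    import org.apache.iceberg.catalog.TableIdentifier;
    import org.apache.iceberg.hadoop.HadoopCatalog;

    // point the catalog at a warehouse directory on HDFS; no catalog
    // service (HMS, JDBC, REST) is involved, table state lives in files
    Configuration conf = new Configuration();
    HadoopCatalog catalog = new HadoopCatalog(conf, "hdfs://namenode:8020/warehouse");

    // engineA already wrote db.events; check what engineB would see
    Table table = catalog.loadTable(TableIdentifier.of("db", "events"));
    System.out.println(table.currentSnapshot());
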
I believe that even though we don't consider HadoopCatalog a production-grade solution as it is now, it has its benefits for lightweight experimentation.

- I'm +1 for keeping HadoopCatalog.
- We should emphasize that HDFS is the intended storage for HadoopCatalog (can we enforce this in the code?).
- Apparently, part of this community is open to adding enhancements to HadoopCatalog to bring it closer to production grade (lisoda). I don't think we should block these contributions.
- If we say that the REST catalog is preferred over HadoopCatalog, I think the Iceberg project should offer its own open-source implementation, available for everyone.

Regards,
Gabor

On Thu, Jul 25, 2024 at 9:04 PM Ryan Blue <b...@databricks.com.invalid> wrote:

> There are ways to use object store or file system features to do this, but there are a lot of variations. Building implementations and trying to standardize each one is a lot of work. And then you still get a catalog that doesn't support important features.
>
> I don't think that this is a good direction to build for the Iceberg project. But I also have no objection to someone doing it in a different project that uses the Iceberg metadata format.
>
> On Tue, Jul 23, 2024 at 5:57 PM lisoda <lis...@yeah.net> wrote:
>
>> Sir, regarding this point, we have some experience. In my view, as long as the file system supports atomic single-file writes, where the file becomes visible to other clients immediately after a successful write, that is sufficient: we can do without the rename operation as long as the file system guarantees this. Of course, if the object storage system supports mutual exclusion, we can also uniformly use the rename operation for committing. In theory, this avoids having to provide a large number of commit strategies for different file systems.
>>
>> ---- Replied Message ----
>> From: Jack Ye <yezhao...@gmail.com>
>> Date: 07/24/2024 02:52
>> To: dev@iceberg.apache.org
>> Subject: Re: Re: [DISCUSS] Deprecate HadoopTableOperations, move to tests in 2.0
>>
>> If we come up with a new storage-only catalog implementation that could solve those limitations and also leverage the new features being developed in object storage, would that be a potential alternative strategy? HadoopCatalog users would then have a way to move forward with a storage-only catalog that can still run on HDFS, and we could fully deprecate HadoopCatalog.
>>
>> -Jack
>>
>> On Tue, Jul 23, 2024 at 10:00 AM Ryan Blue <b...@databricks.com.invalid> wrote:
>>
>>> I don't think we would want to put this in a module with other catalog implementations. It has serious limitations and is actively discouraged, while the other catalog implementations still have value, either as REST back-end catalogs or as regular catalogs for many users.
>>>
>>> On Tue, Jul 23, 2024 at 9:11 AM Jack Ye <yezhao...@gmail.com> wrote:
>>>
>>>> For some additional information, we also have some Iceberg HDFS users on EMR. Those are mainly users with long-running Hadoop and HBase installations, which they typically refresh every 1-2 years. From my understanding, they use S3 for data storage, but metadata is kept in the local HDFS cluster, so HadoopCatalog works well for them.
>>>>
>>>> I remember we discussed moving all catalog implementations in the main repo to a separate iceberg-catalogs repo. Could we do this move as a part of that effort?
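>>>>
>>>> For what it's worth, that metadata-on-HDFS, data-on-S3 layout is easy to express. A rough sketch (bucket, host, and schema are made up, and I have not tested this exact snippet):
>>>>
>>>>     import org.apache.hadoop.conf.Configuration;
>>>>     import org.apache.iceberg.PartitionSpec;
>>>>     import org.apache.iceberg.Schema;
>>>>     import org.apache.iceberg.catalog.TableIdentifier;
>>>>     import org.apache.iceberg.hadoop.HadoopCatalog;
>>>>     import org.apache.iceberg.types.Types;
>>>>
>>>>     // metadata (and the commit protocol) stay on HDFS...
>>>>     HadoopCatalog catalog =
>>>>         new HadoopCatalog(new Configuration(), "hdfs://namenode:8020/warehouse");
>>>>
>>>>     Schema schema = new Schema(Types.NestedField.required(1, "id", Types.LongType.get()));
>>>>     catalog.buildTable(TableIdentifier.of("db", "events"), schema)
>>>>         .withPartitionSpec(PartitionSpec.unpartitioned())
>>>>         // ...while data files land in S3
>>>>         .withProperty("write.data.path", "s3a://my-bucket/db/events/data")
>>>>         .create();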
>>>>
>>>> -Jack
>>>>
>>>> On Tue, Jul 23, 2024 at 8:46 AM Ryan Blue <b...@databricks.com.invalid> wrote:
>>>>
>>>>> Thanks for the context, lisoda. I agree that it's good to understand the issues you're facing with the HadoopCatalog. One follow-up question that I have is what the underlying storage is. Are you using HDFS for those 30,000 customers?
>>>>>
>>>>> I think you're right that there is a challenge to migrating. Because there is no catalog requirement, it's hard to make sure you have all of the writers migrated. I think that means we do need to have a plan or recommendation for people currently using this catalog in production, but it also puts more pressure on us to deprecate this catalog and avoid more people having this problem.
>>>>>
>>>>> I think it's a good idea to make the spec change, which we have agreement on, and to ensure that the FS catalog and table operations are properly deprecated to show that they should not be used. I'm not sure whether there is support in the community for moving the implementation into a new iceberg-hadoop module, but at a minimum we can't just remove this right away. I think that a separate iceberg-hadoop module would make the most sense.
>>>>>
>>>>> On Thu, Jul 18, 2024 at 11:09 PM lisoda <lis...@yeah.net> wrote:
>>>>>
>>>>>> Hi team.
>>>>>> I am not a PMC member, just a regular user. Instead of discussing whether HadoopCatalog needs to continue to exist, I'd like to share a more practical issue.
>>>>>>
>>>>>> We currently serve over 30,000 customers, all of whom use Iceberg to store their foundational data, and all business analyses are conducted based on Iceberg. However, all the Iceberg tables use hadoop_catalog. At least, this has been the case since I started working with our production environment system.
>>>>>>
>>>>>> In recent days, I've attempted to migrate hadoop_catalog to jdbc-catalog, but I failed. We store 2PB of data, and replacing the current catalogs has become an almost impossible task. Users not only create hadoop_catalog tables through Spark, they also continuously write data into Iceberg as hadoop_catalog tables through third-party OLAP systems, Flink, and other means. Given this situation, we can only continue to fix hadoop_catalog and provide services to customers.
>>>>>>
>>>>>> I understand that the community wants to make a big push into rest-catalog, and I agree with the direction the community is going. But considering that there might be a significant number of users facing similar issues, can we at least retain a module similar to iceberg-hadoop to extend hadoop_catalog? If it is removed, we won't be able to continue providing services to customers. So, if possible, please consider this option.
>>>>>>
>>>>>> Thank you all.
>>>>>>
>>>>>> Kind regards,
>>>>>> lisoda
>>>>>>
>>>>>> At 2024-07-19 01:28:18, "Jack Ye" <yezhao...@gmail.com> wrote:
>>>>>>
>>>>>> Thank you for bringing this up, Ryan. I have also been in the camp of saying HadoopCatalog is not recommended, but after thinking about this more deeply last night, I now have mixed feelings about this topic.
>>>>>> Just to comment on the reasons you listed first:
>>>>>>
>>>>>> * For reasons 1 & 2, it looks like the root cause is that people try to use HadoopCatalog outside native HDFS because there are HDFS connectors to other storages, like S3AFileSystem. However, the norm for such usage has been that those connectors do not strictly follow HDFS semantics, and it is assumed that people acknowledge the implications of such usage and accept the risk. For example, S3AFileSystem was there even before S3 was strongly consistent, but people have been using it to write files.
>>>>>>
>>>>>> * For reason 3, there are multiple catalogs that do not support all operations (e.g. Glue for atomic table rename) and people still widely use them.
>>>>>>
>>>>>> * For reason 4, I see that more as a missing feature. More features could definitely be developed in that catalog implementation.
>>>>>>
>>>>>> So the key question to me is: how can we prevent people from using HadoopCatalog outside native HDFS? We know HadoopCatalog is popular because it is a storage-only solution. For object storages specifically, HadoopCatalog is not suitable for 2 reasons:
>>>>>>
>>>>>> (1) file writes do not enforce mutual exclusion, so the catalog cannot enforce Iceberg's optimistic concurrency requirement (i.e. it cannot do an atomic compare-and-swap)
>>>>>>
>>>>>> (2) the directory-based design is not preferred in object storage and will result in bad performance.
>>>>>>
>>>>>> However, now that I look at these 2 issues, they are getting outdated.
>>>>>>
>>>>>> (1) object storage is starting to enforce file mutual exclusion. GCS supports a file generation number [1] that increments monotonically, and you can use x-goog-if-generation-match [2] to perform an atomic swap. A similar feature [3] exists in Azure Blob Storage. I cannot speak for the S3 team roadmap, but Amazon S3 is clearly falling behind in this domain, and with market competition, it is very clear that similar features will come in the reasonably near future.
>>>>>>
>>>>>> (2) directory buckets are becoming the norm. Amazon S3 announced directory buckets at 2023 re:Invent [4], which do not have the same performance limitations even if you have very nested folders and many objects in a folder. GCS also has a similar feature, launched in preview [5] right now. Azure has had this feature since 2021 [6].
>>>>>>
>>>>>> With these new developments in the industry, a storage-only Iceberg catalog becomes very attractive. It is simple, with only one service dependency. It can safely perform an atomic compare-and-swap. It is performant, without the need to worry about folder and file organization. If you want to add additional features for things like access control, there are also integrations like access grants [7] that can do it in a very scalable way.
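>>>>>>
>>>>>> To make the atomic swap in point (1) concrete, here is a rough sketch against the GCS Java client (bucket, object names, and the pointer payload are made up; treat it as an illustration of the precondition, not a vetted commit implementation):
>>>>>>
>>>>>>     import java.nio.charset.StandardCharsets;
>>>>>>     import com.google.cloud.storage.Blob;
>>>>>>     import com.google.cloud.storage.BlobId;
>>>>>>     import com.google.cloud.storage.BlobInfo;
>>>>>>     import com.google.cloud.storage.Storage;
>>>>>>     import com.google.cloud.storage.StorageException;
>>>>>>     import com.google.cloud.storage.StorageOptions;
>>>>>>
>>>>>>     Storage storage = StorageOptions.getDefaultInstance().getService();
>>>>>>
>>>>>>     // read the current metadata pointer and remember its generation
>>>>>>     Blob current = storage.get(BlobId.of("bucket", "db/t/metadata/pointer"));
>>>>>>     long generation = current.getGeneration();
>>>>>>
>>>>>>     // generationMatch() maps to x-goog-if-generation-match: the write
>>>>>>     // succeeds only if nobody replaced the pointer since we read it
>>>>>>     byte[] newPointer = "v43.metadata.json".getBytes(StandardCharsets.UTF_8);
>>>>>>     try {
>>>>>>       storage.create(
>>>>>>           BlobInfo.newBuilder(BlobId.of("bucket", "db/t/metadata/pointer", generation)).build(),
>>>>>>           newPointer,
>>>>>>           Storage.BlobTargetOption.generationMatch());
>>>>>>     } catch (StorageException e) {
>>>>>>       // precondition failed: a concurrent committer won, refresh and retry
>>>>>>     }
>>>>>>
>>>>>> That compare-and-swap is essentially the whole commit protocol a storage-only catalog needs.
>>>>>>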
>>>>>> I know the direction in the community so far is to go with the REST catalog, and I am personally a big advocate for that. However, that requires either building a full REST catalog or choosing a catalog vendor that supports REST. There are many capabilities that REST would unlock, but those are visions that I expect will take many years for the community to drive consensus on and build. If I am the CTO of a small company and I just want an Iceberg data lake(house) right now, do I choose REST, or do I choose (or even just build) a storage-only Iceberg catalog? I feel I would actually choose the latter.
>>>>>>
>>>>>> Going back to the discussion points, my current take on this topic is:
>>>>>>
>>>>>> (1) +1 for clarifying in the spec that HadoopCatalog should only work with HDFS.
>>>>>>
>>>>>> (2) +1 if we want to block non-HDFS use cases in HadoopCatalog by default (e.g. fail if using S3A), but we should allow a feature flag to unblock the usage, so that people can use it after understanding the implications and risks, just like how people use S3A today.
>>>>>>
>>>>>> (3) +0 for removing HadoopCatalog from the core library. It could be in a different module like iceberg-hdfs if that is more suitable.
>>>>>>
>>>>>> (4) -1 for moving HadoopCatalog to tests, because HDFS is still a valid use case for Iceberg. With measures 1-3 above in place, people who actually have an HDFS use case should be able to continue to innovate on and optimize the HadoopCatalog implementation. Although "HDFS is becoming much less common", looking at GitHub issues and discussion forums, it still has a pretty big user base.
>>>>>>
>>>>>> (5) In general, I propose we separate the discussion of HadoopCatalog from that of a "storage-only catalog" that also deals with other object stores. With these latest industry developments, we should evaluate the direction of building a storage-only Iceberg catalog and see whether the community has an interest in that. I can raise a thread about it after this discussion is closed.
>>>>>>
>>>>>> Best,
>>>>>> Jack Ye
>>>>>>
>>>>>> [1] https://cloud.google.com/storage/docs/object-versioning#file_restoration_behavior
>>>>>> [2] https://cloud.google.com/storage/docs/xml-api/reference-headers#xgoogifgenerationmatch
>>>>>> [3] https://learn.microsoft.com/en-us/rest/api/storageservices/specifying-conditional-headers-for-blob-service-operations
>>>>>> [4] https://docs.aws.amazon.com/AmazonS3/latest/userguide/directory-buckets-overview.html
>>>>>> [5] https://cloud.google.com/storage/docs/buckets#enable-hns
>>>>>> [6] https://learn.microsoft.com/en-us/azure/storage/blobs/data-lake-storage-namespace
>>>>>> [7] https://docs.aws.amazon.com/AmazonS3/latest/userguide/access-grants.html
>>>>>>
>>>>>> On Thu, Jul 18, 2024 at 7:16 AM Eduard Tudenhöfner <etudenhoef...@apache.org> wrote:
>>>>>>
>>>>>>> +1 on deprecating now and removing them from the codebase with Iceberg 2.0
>>>>>>>
>>>>>>> On Thu, Jul 18, 2024 at 10:40 AM Ajantha Bhat <ajanthab...@gmail.com> wrote:
>>>>>>>
>>>>>>>> +1 on deprecating the `File System Tables` from the spec and `HadoopCatalog`/`HadoopTableOperations` in code for now, and removing them permanently in the 2.0 release.
>>>>>>>>
>>>>>>>> For testing we can use `InMemoryCatalog`, as others mentioned (see the sketch below).
>>>>>>>>
>>>>>>>> I am not sure about moving them to tests or keeping them only for HDFS, because it leads to confusion for existing users of the Hadoop catalog.
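>>>>>>>>
>>>>>>>> To expand on the `InMemoryCatalog` point, tests would look roughly like this (a sketch; the identifiers are made up):
>>>>>>>>
>>>>>>>>     import java.util.Map;
>>>>>>>>     import org.apache.iceberg.Schema;
>>>>>>>>     import org.apache.iceberg.Table;
>>>>>>>>     import org.apache.iceberg.catalog.Namespace;
>>>>>>>>     import org.apache.iceberg.catalog.TableIdentifier;
>>>>>>>>     import org.apache.iceberg.inmemory.InMemoryCatalog;
>>>>>>>>     import org.apache.iceberg.types.Types;
>>>>>>>>
>>>>>>>>     // everything lives in memory: no file system, no catalog service
>>>>>>>>     InMemoryCatalog catalog = new InMemoryCatalog();
>>>>>>>>     catalog.initialize("test", Map.of());
>>>>>>>>     catalog.createNamespace(Namespace.of("db"));
>>>>>>>>
>>>>>>>>     Schema schema = new Schema(Types.NestedField.required(1, "id", Types.LongType.get()));
>>>>>>>>     Table table = catalog.createTable(TableIdentifier.of("db", "t"), schema);
>>>>>>>>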
>>>>>>>> I wanted to have it deprecated 2 years ago <https://apache-iceberg.slack.com/archives/C025PH0G1D4/p1647950504955309>, and I remember that we discussed it in a sync at the time and left it as is. Also, when a user recently brought up the LockManager and refactoring HadoopTableOperations in Slack <https://apache-iceberg.slack.com/archives/C03LG1D563F/p1720075009593789?thread_ts=1719993403.208859&cid=C03LG1D563F>, I asked them to open this discussion on the mailing list, so that we can conclude it once and for all.
>>>>>>>>
>>>>>>>> - Ajantha
>>>>>>>>
>>>>>>>> On Thu, Jul 18, 2024 at 12:49 PM Fokko Driesprong <fo...@apache.org> wrote:
>>>>>>>>
>>>>>>>>> Hey Ryan and others,
>>>>>>>>>
>>>>>>>>> Thanks for bringing this up. I would be in favor of removing the HadoopTableOperations, mostly because of the reasons that you already mentioned, but also because it is not fully in line with the first principles of Iceberg (being object-store native), as it uses file listing.
>>>>>>>>>
>>>>>>>>> I think we should deprecate the HadoopTables to raise the attention of their users. I would be reluctant to move it to test just to use it for testing purposes; I'd rather remove it and replace its use in tests with the InMemoryCatalog.
>>>>>>>>>
>>>>>>>>> Regarding the StaticTable, this is an easy way to have a read-only table by directly pointing to the metadata. This also lives in Java under StaticTableOperations <https://github.com/apache/iceberg/blob/main/core/src/main/java/org/apache/iceberg/StaticTableOperations.java>. It isn't a full-blown catalog where you can list {tables,schemas}, update tables, etc. As ZENOTME pointed out already, it is all up to the user; for example, there is no listing of directories to determine which tables are in the catalog.
>>>>>>>>>
>>>>>>>>>> is there a probability that the strategy used by HadoopCatalog is not compatible with the table managed by other catalogs?
>>>>>>>>>
>>>>>>>>> Yes, they are different: the spec's section on File System Tables <https://github.com/apache/iceberg/blob/main/format/spec.md#file-system-tables> is what the HadoopTable implementation uses, whereas the other catalogs follow Metastore Tables <https://github.com/apache/iceberg/blob/main/format/spec.md#metastore-tables>.
>>>>>>>>>
>>>>>>>>> Kind regards,
>>>>>>>>> Fokko
>>>>>>>>>
>>>>>>>>> On Thu, Jul 18, 2024 at 07:19, NOTME ZE <st810918...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> According to our requirements, this function is for users who want to read Iceberg tables without relying on any catalog, and I think StaticTable is more flexible and clearer in semantics. For StaticTable, it's the user's responsibility to decide which metadata of the table to read. But for a read-only HadoopCatalog, the metadata may be decided by the catalog; is there a chance that the strategy used by HadoopCatalog is not compatible with tables managed by other catalogs?
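>>>>>>>>>>
>>>>>>>>>> For reference, the Java analogue of this read-only pattern looks roughly like the following (a sketch, not tested code; the metadata path is a placeholder). The user, not a catalog, decides exactly which metadata version to read:
>>>>>>>>>>
>>>>>>>>>>     import org.apache.hadoop.conf.Configuration;
>>>>>>>>>>     import org.apache.iceberg.BaseTable;
>>>>>>>>>>     import org.apache.iceberg.StaticTableOperations;
>>>>>>>>>>     import org.apache.iceberg.Table;
>>>>>>>>>>     import org.apache.iceberg.hadoop.HadoopFileIO;
>>>>>>>>>>
>>>>>>>>>>     // pin the table to one explicit metadata file
>>>>>>>>>>     String metadataLocation =
>>>>>>>>>>         "hdfs://namenode:8020/warehouse/db/t/metadata/v12.metadata.json";
>>>>>>>>>>     StaticTableOperations ops =
>>>>>>>>>>         new StaticTableOperations(metadataLocation, new HadoopFileIO(new Configuration()));
>>>>>>>>>>     Table table = new BaseTable(ops, "db.t");
>>>>>>>>>>
>>>>>>>>>> Commits on such a table are unsupported and it stays frozen at that metadata version, which is what makes the semantics clear.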
>>>>>>>>>>
>>>>>>>>>> On Thu, Jul 18, 2024 at 11:39, Renjie Liu <liurenjie2...@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> I think there are two ways to do this:
>>>>>>>>>>> 1. As Xuanwo said, we refactor HadoopCatalog to be read-only and throw an unsupported-operation exception for operations that manipulate tables.
>>>>>>>>>>> 2. Totally deprecate HadoopCatalog and add a StaticTable, as we did in pyiceberg and iceberg-rust.
>>>>>>>>>>>
>>>>>>>>>>> On Thu, Jul 18, 2024 at 11:26 AM Xuanwo <xua...@apache.org> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi, Renjie
>>>>>>>>>>>>
>>>>>>>>>>>> Are you suggesting that we refactor HadoopCatalog into a FileSystemCatalog to enable direct reading from file systems like HDFS, S3, and Azure Blob Storage? This catalog would be read-only and would not support write operations.
>>>>>>>>>>>>
>>>>>>>>>>>> On Thu, Jul 18, 2024, at 10:23, Renjie Liu wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> Hi, Ryan:
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks for raising this. I agree that HadoopCatalog is dangerous for manipulating tables/catalogs given the limitations of different file systems. But I see that there are some users who want to read Iceberg tables without relying on any catalog; this is also the motivating use case for StaticTable in pyiceberg and iceberg-rust. Is there something similar in the Java implementation?
>>>>>>>>>>>>
>>>>>>>>>>>> On Thu, Jul 18, 2024 at 7:01 AM Ryan Blue <b...@apache.org> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> Hey everyone,
>>>>>>>>>>>>
>>>>>>>>>>>> There has been some recent discussion about improving HadoopTableOperations and the catalog based on those tables, but we've discouraged using file-system-only tables (or "hadoop" tables) for years now because of major problems:
>>>>>>>>>>>> * It is only safe to use hadoop tables with HDFS; most local file systems, S3, and other common object stores are unsafe
>>>>>>>>>>>> * Despite not providing atomicity guarantees outside of HDFS, people use the tables in unsafe situations
>>>>>>>>>>>> * HadoopCatalog cannot implement atomic operations for rename and drop table, which are commonly used in data engineering
>>>>>>>>>>>> * Alternative file names (for instance, when using metadata file compression) also break guarantees
>>>>>>>>>>>>
>>>>>>>>>>>> While these tables are useful for testing in non-production scenarios, I think it's misleading to have them in the core module because there's an appearance that they are a reasonable choice. I propose we deprecate the HadoopTableOperations and HadoopCatalog implementations and move them to tests the next time we can make breaking API changes (2.0).
>>>>>>>>>>>>
>>>>>>>>>>>> I think we should also consider similar fixes to the table spec. It currently describes how HadoopTableOperations works, which does not work in object stores or local file systems. HDFS is becoming much less common, and I propose that we note that the strategy in the spec should ONLY be used with HDFS.
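>>>>>>>>>>>>
>>>>>>>>>>>> For anyone who hasn't looked at it recently, the strategy in the spec boils down to a rename-based commit, roughly the following (a simplified sketch of the idea, not the actual implementation; the version number is made up):
>>>>>>>>>>>>
>>>>>>>>>>>>     import org.apache.hadoop.conf.Configuration;
>>>>>>>>>>>>     import org.apache.hadoop.fs.FileSystem;
>>>>>>>>>>>>     import org.apache.hadoop.fs.Path;
>>>>>>>>>>>>
>>>>>>>>>>>>     FileSystem fs = FileSystem.get(new Configuration());
>>>>>>>>>>>>     Path tmp = new Path("/warehouse/db/t/metadata/.tmp-v43.metadata.json");
>>>>>>>>>>>>     Path next = new Path("/warehouse/db/t/metadata/v43.metadata.json");
>>>>>>>>>>>>     // ... write the new table metadata to tmp ...
>>>>>>>>>>>>
>>>>>>>>>>>>     // the commit: only safe where rename is atomic and fails when the
>>>>>>>>>>>>     // destination exists, which HDFS guarantees but S3A and most local
>>>>>>>>>>>>     // file systems do not
>>>>>>>>>>>>     if (!fs.rename(tmp, next)) {
>>>>>>>>>>>>       throw new RuntimeException("Commit failed: version 43 already exists");
>>>>>>>>>>>>     }
>>>>>>>>>>>>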
>>>>>>>>>>>> What do other people think?
>>>>>>>>>>>>
>>>>>>>>>>>> Ryan
>>>>>>>>>>>>
>>>>>>>>>>>> --
>>>>>>>>>>>> Ryan Blue
>>>>>>>>>>>>
>>>>>>>>>>>> Xuanwo
>>>>>>>>>>>>
>>>>>>>>>>>> https://xuanwo.io/
>>>>>
>>>>> --
>>>>> Ryan Blue
>>>>> Databricks
>>>
>>> --
>>> Ryan Blue
>>> Databricks
>
> --
> Ryan Blue
> Databricks