Re: Iceberg Public Dataset Launch

Manu Zhang Sun, 08 Mar 2026 21:26:31 -0700

Hi Talat,

Thanks for sharing the gist. It seems that the
IRCReadRequestsPerMinutePerProject quota can be easily exceeded. I
encountered the following error while loading the table:


Error loading table: RESTError 429: Received unexpected JSON Payload: {
  "error": {
    "code": 429,
    "message": "Quota exceeded for quota metric 'Iceberg REST Catalog read requests' and limit 'Iceberg REST Catalog read requests per minute' of service 'biglake.googleapis.com' for consumer 'project_number:1057666841514'.",
    "status": "RESOURCE_EXHAUSTED",
    "details": [
      {
        "@type": "type.googleapis.com/google.rpc.ErrorInfo",
        "reason": "RATE_LIMIT_EXCEEDED",
        "domain": "googleapis.com",
        "metadata": {
          "quota_limit": "IRCReadRequestsPerMinutePerProject",
          "consumer": "projects/1057666841514",
          "quota_location": "global",
          "quota_unit": "1/min/{project}",
          "quota_limit_value": "300",
          "service": "biglake.googleapis.com",
          "quota_metric": "biglake.googleapis.com/irc_read_requests"
        }
      },
      {
        "@type": "type.googleapis.com/google.rpc.Help",
        "links": [
          {
            "description": "Request a higher quota limit.",
            "url": "https://cloud.google.com/docs/quotas/help/request_increase"
          }
        ]
      }
    ]
  }
}

Do you have any suggestions on how to handle this?
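
In the meantime, one possible client-side workaround is to retry with
exponential backoff when the 429 shows up. Below is a minimal sketch, not
a tested fix. The helper names (looks_rate_limited, call_with_backoff) and
the default delays are hypothetical, and it assumes the client puts the
status code or "RESOURCE_EXHAUSTED" in the exception text, as in the error
above.

```python
import random
import time


def looks_rate_limited(exc: Exception) -> bool:
    # Assumption: the client surfaces the HTTP status in the exception text,
    # as in the "RESTError 429" / "RESOURCE_EXHAUSTED" message above.
    text = str(exc)
    return "429" in text or "RESOURCE_EXHAUSTED" in text


def call_with_backoff(fn, max_attempts=5, base_delay=1.0, max_delay=60.0,
                      sleep=time.sleep):
    """Call fn(), retrying with exponential backoff if it looks rate limited."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception as exc:
            last_attempt = attempt == max_attempts - 1
            # Give up on the final attempt, or on errors unrelated to quota.
            if last_attempt or not looks_rate_limited(exc):
                raise
            # The delay doubles on each retry and is capped at max_delay;
            # random jitter keeps parallel clients from retrying in lockstep.
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(delay + random.uniform(0, delay / 2))
```

With pyiceberg this could wrap the failing call, for example
call_with_backoff(lambda: catalog.load_table("namespace.table")), where
catalog and the table name are placeholders. Since the quota is per
minute, backoff only spaces the requests out; it does not reduce how many
requests a table load makes.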

Regards,
Manu

On Thu, Jan 22, 2026 at 7:33 AM Talat Uyarer via dev <[email protected]>
wrote:

> Hi Steve,
>
> The public dataset is accessible from anywhere. BigLake offers a free tier
> with the first 50,000 requests being free each month [1]. While not
> entirely free, it's essentially "freeish." I'm uncertain about egress
> charges. When using the dataset, users must specify a project that will be
> billed. However, based on my personal experience with my project, I haven't
> incurred any charges. I know spinning up a Spark cluster is not a big deal
> for you, but if you want to give it a fast try, I also created a gist with
> pyiceberg [2].
>
> [1] https://cloud.google.com/products/biglake/pricing
> [2] https://gist.github.com/talatuyarer/02568a38a7630434556e7dc1f0a5ab40
>
> On Wed, Jan 21, 2026 at 5:31 AM Steve Loughran <[email protected]>
> wrote:
>
>>
>> are these remotely accessible? and who pays?
>>
>> I'm just thinking of whether it's a datasource for regression testing.
>>
>> For s3a we use public (free) parquet datasets for some of the scale read
>> testing...keeps setup time minimal and stops "needs a few hundred MB of
>> data in s3" as a cost blocker to contributors (*).
>>
>> It'd be nice to have public iceberg datasets in the various stores for
>> similar regression tests
>>
>> steve
>>
>> (*) we use NOAA data, luckily the s3 bucket hasn't been decommissioned by
>> the US govt, though I did worry about that last year
>>
>> On Wed, 14 Jan 2026 at 21:27, Alex Stephen via dev <
>> [email protected]> wrote:
>>
>>> Hi all,
>>>
>>> We just launched a public dataset (backed by a public Iceberg REST
>>> Catalog) that can be accessed by any Iceberg-enabled query engine. The goal
>>> is for Iceberg developers to begin diving into the ecosystem without
>>> bootstrapping a full catalog and creating data.
>>>
>>> We'd love to hear any of your thoughts on how we can improve it.
>>>
>>> Announcement blog post
>>> <https://opensource.googleblog.com/2026/01/explore-public-datasets-with-apache-iceberg-and-biglake.html>
>>> Example PySpark script
>>> <https://gist.github.com/rambleraptor/7fd2fd55a208da7e5c000430d54d8db4>
>>>
>>> Thanks!
>>>
>>> -- Alex Stephen
>>>
>>
