Hi Talat,
Thanks for sharing the gist. It seems that the
IRCReadRequestsPerMinutePerProject quota can be easily exceeded. I
encountered the following error while loading the table:
Error loading table: RESTError 429: Received unexpected JSON Payload: {
  "error": {
    "code": 429,
    "message": "Quota exceeded for quota metric 'Iceberg REST Catalog read requests' and limit 'Iceberg REST Catalog read requests per minute' of service 'biglake.googleapis.com' for consumer 'project_number:1057666841514'.",
    "status": "RESOURCE_EXHAUSTED",
    "details": [
      {
        "@type": "type.googleapis.com/google.rpc.ErrorInfo",
        "reason": "RATE_LIMIT_EXCEEDED",
        "domain": "googleapis.com",
        "metadata": {
          "quota_limit": "IRCReadRequestsPerMinutePerProject",
          "consumer": "projects/1057666841514",
          "quota_location": "global",
          "quota_unit": "1/min/{project}",
          "quota_limit_value": "300",
          "service": "biglake.googleapis.com",
          "quota_metric": "biglake.googleapis.com/irc_read_requests"
        }
      },
      {
        "@type": "type.googleapis.com/google.rpc.Help",
        "links": [
          {
            "description": "Request a higher quota limit.",
            "url": "https://cloud.google.com/docs/quotas/help/request_increase"
          }
        ]
      }
    ]
  }
}
Do you have any suggestions on how to handle this?
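In the meantime, one client-side option might be to retry with exponential backoff whenever a 429 comes back, since the limit is per minute. The following is only a minimal sketch: the helper name retry_on_rate_limit is hypothetical, and it assumes the rate-limit error surfaces as an exception whose message contains "429" (as in the text above). I have not checked which exception class pyiceberg actually raises here, so the is_rate_limited check may need adjusting.

```python
import random
import time


def retry_on_rate_limit(func, max_attempts=5, base_delay=1.0, max_delay=60.0,
                        is_rate_limited=lambda exc: "429" in str(exc),
                        sleep=time.sleep):
    """Call func(), retrying with exponential backoff plus jitter on 429s.

    Hypothetical helper: the default is_rate_limited check is an assumption
    based on the error text, not on a documented exception type.
    """
    for attempt in range(max_attempts):
        try:
            return func()
        except Exception as exc:
            # Re-raise anything that is not a rate-limit error, and give up
            # once the final attempt has failed.
            if not is_rate_limited(exc) or attempt == max_attempts - 1:
                raise
            # Double the wait on each attempt, capped at max_delay.
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Add up to 50% jitter so parallel clients do not retry in lockstep.
            sleep(delay + random.uniform(0, delay / 2))
```

With this, a table load could be wrapped as retry_on_rate_limit(lambda: catalog.load_table("namespace.table")), where the catalog object and table identifier are placeholders. Backoff only smooths over short bursts; if the workload steadily needs more than 300 read requests per minute, requesting a higher quota via the link in the error is probably still required.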
Regards,
Manu
On Thu, Jan 22, 2026 at 7:33 AM Talat Uyarer via dev <[email protected]>
wrote:
> Hi Steve,
>
> The public dataset is accessible from anywhere. BigLake offers a free tier
> with the first 50,000 requests being free each month [1]. While not
> entirely free, it's essentially "freeish." I'm uncertain about egress
> charges. When using the dataset, users must specify a project that will be
> billed. However, based on my personal experience with my project, I haven't
> incurred any charges. I know spinning up a Spark cluster is not a big deal
> for you, but if you want to give it a fast try, I also created a gist with
> pyiceberg [2].
>
> [1] https://cloud.google.com/products/biglake/pricing
> [2] https://gist.github.com/talatuyarer/02568a38a7630434556e7dc1f0a5ab40
>
> On Wed, Jan 21, 2026 at 5:31 AM Steve Loughran <[email protected]>
> wrote:
>
>>
>> are these remotely accessible? and who pays?
>>
>> I'm just thinking of whether it's a datasource for regression testing.
>>
>> For s3a we use public (free) parquet datasets for some of the scale read
>> testing...keeps setup time minimal and stops "needs a few hundred MB of
>> data in s3" as a cost blocker to contributors (*).
>>
>> It'd be nice to have public iceberg datasets in the various stores for
>> similar regression tests
>>
>> steve
>>
>> (*) we use NOAA data, luckily the s3 bucket hasn't been decommissioned by
>> the US govt, though I did worry about that last year
>>
>> On Wed, 14 Jan 2026 at 21:27, Alex Stephen via dev <
>> [email protected]> wrote:
>>
>>> Hi all,
>>>
>>> We just launched a public dataset (backed by a public Iceberg REST
>>> Catalog) that can be accessed by any Iceberg-enabled query engine. The goal
>>> is for Iceberg developers to begin diving into the ecosystem without
>>> bootstrapping a full catalog and creating data.
>>>
>>> We'd love to hear any of your thoughts on how we can improve it.
>>>
>>> Announcement blog post
>>> <https://opensource.googleblog.com/2026/01/explore-public-datasets-with-apache-iceberg-and-biglake.html>
>>> Example PySpark script
>>> <https://gist.github.com/rambleraptor/7fd2fd55a208da7e5c000430d54d8db4>
>>>
>>> Thanks!
>>>
>>> -- Alex Stephen
>>>
>>