jonded94 opened a new issue, #17880: URL: https://github.com/apache/datafusion/issues/17880
Hey 👋

I'm currently facing an issue with [parquet-viewer](https://github.com/XiangpengHao/parquet-viewer) when file names contain more than one dot. parquet-viewer uses DataFusion under the hood, and the error message I'm seeing definitely comes from DataFusion, so I suspect the cause might lie somewhere in this codebase.

For reference, [this](https://github.com/XiangpengHao/parquet-viewer/issues/65) is the issue I'm seeing: if a file is named `test.[random-strings].parquet`, it leads to this error:

```
Plan(
    "failed to resolve schema: test",
)
```

However, when I try the same thing with DataFusion from Python, I can't reproduce the problem:

```
>>> from datafusion import SessionContext
>>> ctx = SessionContext()
>>> df = ctx.read_parquet("[random-path]/test.ako.parquet")
>>> df.show()
DataFrame()
+----+-----+-----+
| l1 | bar | foo |
+----+-----+-----+
|    |     | 0   |
|    | 0   |     |
+----+-----+-----+
>>> df.limit(2)
DataFrame()
+----+-----+-----+
| l1 | bar | foo |
+----+-----+-----+
|    |     | 0   |
|    | 0   |     |
+----+-----+-----+
>>> df.schema()
l1: string_view
bar: uint64
foo: uint64
```

I did, however, find [this](https://github.com/apache/datafusion/blob/3ee52f85fdb94544da04f6a67f0c7fc03c714843/datafusion/catalog/src/listing_schema.rs#L119) line in the DataFusion codebase, which looks fishy to me: it could cause problems when several Parquet files are named like `part.1.parquet` and `part.2.parquet`. Maybe it is also connected to the issue I'm seeing here?
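To make the `part.1.parquet` / `part.2.parquet` concern concrete, here is a minimal Python sketch. It assumes that the linked line derives a table name by keeping only the part of the file name before the first dot. That is my reading of the line and has not been confirmed. The helper names below are hypothetical and are not DataFusion APIs.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def table_name_first_segment(file_name: str) -> str:
    # Assumed behaviour: keep only the text before the first dot.
    return file_name.split(".")[0]


def table_name_strip_extension(file_name: str) -> str:
    # Hypothetical alternative: drop only the final extension.
    return PurePosixPath(file_name).stem


def group_by_table_name(file_names, derive):
    # Map each derived table name to the files that would claim it.
    groups = defaultdict(list)
    for name in file_names:
        groups[derive(name)].append(name)
    return dict(groups)


files = ["part.1.parquet", "part.2.parquet", "test.ako.parquet"]

first_segment = group_by_table_name(files, table_name_first_segment)
strip_ext = group_by_table_name(files, table_name_strip_extension)

# Two files share the name "part" under the first-segment rule.
print(first_segment)
# Each file keeps a distinct name when only the extension is dropped.
print(strip_ext)
```

Under the first-segment rule, both `part.*` files map to the single name `part`, so one would shadow the other if each file were registered as its own table. Dropping only the extension keeps the names distinct. A name such as `test.ako` still contains a dot, though, so it would probably need quoting in SQL to avoid being read as a qualified `schema.table` reference.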
