coolderli commented on issue #5226:
URL: https://github.com/apache/gravitino/issues/5226#issuecomment-2441473770

   It is difficult to define the relationship between fileset, dataset, and 
model. If we introduce a new dataset catalog and model catalog, it may be a bit 
confusing for users. In addition, datasets or models may only be applicable to 
Python APIs in machine learning scenarios.
   
   I am more inclined to provide a dataset and model API based on the fileset. 
We can record more metadata in the fileset, such as its schema, and the dataset 
API can then use that additional metadata when reading the data.
   ```python
   # using gvfs api to read the fileset
   gvfs.open('gvfs://xxx')
   ```
   
   ``` python
   # using the proposed dataset API to read the fileset (illustrative only)
   dataset = datasets.IterableDataset("catalog.schema.fileset", version='1')
   ```
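
To make the idea more concrete, here is a minimal sketch of a dataset layer that sits on top of a fileset and uses schema metadata recorded on it. Everything here is an assumption for illustration: `FilesetMeta`, `FilesetDataset`, and the `schema` property stored as JSON are hypothetical names, not existing Gravitino APIs, and a local temporary directory stands in for the fileset's storage location.

```python
import csv
import json
import os
import tempfile
from dataclasses import dataclass, field
from typing import Dict, Iterator, List


@dataclass
class FilesetMeta:
    """Hypothetical stand-in for the fileset metadata held by the catalog."""
    name: str                      # e.g. "catalog.schema.fileset"
    location: str                  # storage root of the fileset
    properties: Dict[str, str] = field(default_factory=dict)


class FilesetDataset:
    """Yields typed rows from the CSV files of a fileset.

    Assumption: the column schema is recorded as JSON under the
    "schema" property of the fileset.
    """

    _CASTS = {"int": int, "float": float, "string": str}

    def __init__(self, meta: FilesetMeta):
        self.meta = meta
        self.schema: Dict[str, str] = json.loads(meta.properties["schema"])

    def __iter__(self) -> Iterator[dict]:
        # Walk the files in a stable order and cast each column
        # according to the schema recorded on the fileset.
        for fname in sorted(os.listdir(self.meta.location)):
            if not fname.endswith(".csv"):
                continue
            path = os.path.join(self.meta.location, fname)
            with open(path, newline="") as f:
                for row in csv.DictReader(f):
                    yield {
                        col: self._CASTS[typ](row[col])
                        for col, typ in self.schema.items()
                    }


def demo() -> List[dict]:
    """Builds a throwaway fileset on local disk and reads it back."""
    with tempfile.TemporaryDirectory() as root:
        with open(os.path.join(root, "part-0.csv"), "w", newline="") as f:
            f.write("id,score\n1,0.5\n2,0.75\n")
        meta = FilesetMeta(
            name="catalog.schema.fileset",
            location=root,
            properties={"schema": json.dumps({"id": "int", "score": "float"})},
        )
        return list(FilesetDataset(meta))


if __name__ == "__main__":
    for record in demo():
        print(record)
```

The point of the sketch is that the raw files stay reachable through the plain file API, while the dataset view only adds interpretation (here, column types) from metadata already attached to the fileset, so no separate dataset catalog is needed.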
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.