> For example, we store the metadata in S3 (standard S3 or an S3-like
> service), and we have Swift, HDFS and many other backend services.

The primary focus of idea 2 is to cache metadata rather than persist it.
Metadata is therefore fetched lazily, so users can restart the daemon
whenever necessary.
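
As a rough sketch of what "cache, not persist" could look like (Python only
for brevity; `LazyMetadataCache`, `fetch` and `ttl_seconds` are made-up names
for illustration, not OpenDAL or ovirtiofs API):

```python
import time
from typing import Callable, Dict, Tuple


class LazyMetadataCache:
    """Hypothetical in-memory metadata cache; nothing is written to disk."""

    def __init__(self, fetch: Callable[[str], dict], ttl_seconds: float = 30.0):
        # `fetch` stands in for a stat-like call against the storage backend.
        self._fetch = fetch
        self._ttl = ttl_seconds
        # path -> (time the entry was fetched, metadata)
        self._entries: Dict[str, Tuple[float, dict]] = {}

    def stat(self, path: str) -> dict:
        now = time.monotonic()
        hit = self._entries.get(path)
        if hit is not None and now - hit[0] < self._ttl:
            # Fresh entry: answer from memory without touching the backend.
            return hit[1]
        # Miss or expired entry: fetch lazily and remember the result.
        meta = self._fetch(path)
        self._entries[path] = (now, meta)
        return meta
```

A restarted daemon simply starts with an empty cache and refills it on
demand, so there is no database file to carry across restarts. The expiry
time is one assumed way to bound how long external changes stay invisible.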

On Tue, Mar 12, 2024, at 21:44, Manjusaka wrote:
> On 2024/3/12 20:56, 余润杰 wrote:
>> Thank you very much for your suggestions. This should be a topic that
>> requires thorough discussion, involving the design philosophy and
>> positioning of ovirtiofs.
>> 
>> My previous design was based on referencing the classic architecture of
>> distributed file systems. The previous proposal stored additional metadata
>> through a KV database. Thanks to KV databases like leveldb, rocksdb, which
>> are used in the form of library and database files, users or ovirtiofs do
>> not need to maintain a separate service process. I believe that, at this
>> point, ovirtiofs transfers its state to the database files and the backend
>> object storage, making the process itself stateless. With configuration
>> files and database files (including additional metadata), ovirtiofs does
>> not maintain any state information in memory, allowing it to be started and
>> restarted arbitrarily. However, users still need to be aware of the
>> existence and significance of the database files, and it is hard to keep
>> the database synchronized with external changes to the object storage
>> system. Such external changes may also be difficult to incorporate
>> directly into the ovirtiofs directory tree, requiring special handling
>> rules.
>> 
>> If we do not consider metadata persistence, ovirtiofs needs to retrieve
>> file system state information from the object storage when restarting. In
>> this scenario, we need to make some assumptions to restore the file system
>> interface. For example, the name of a bucket represents a complete
>> directory path, and the objects in the bucket represent the files in the
>> directory. The implementation of a file system based on such assumptions
>> has certain limitations, including potential performance issues such as
>> uneven object distribution and escaping of metadata operations such as
>> directory traversal. However, the benefit is that ovirtiofs only needs a
>> configuration file to restart and recover, providing a stateless service
>> that can share directories among multiple virtual machines on multiple
>> physical nodes, which is difficult to achieve in the first design. In this
>> case, we do not need to consider the state changes brought about by
>> external modifications to the storage system data, as all state information
>> is managed through the object storage system.
>> 
>> After thinking about it, I now support the second idea, which is to
>> implement the file system interface through assumptions and without
>> persistence, because it gives ovirtiofs broader usage prospects. I would
>> like to revise the proposal accordingly: rework the metadata management
>> design, add a description of stateless service support, and add a
>> description of documentation writing and usage scenarios.
>> 
>> On Mon, Mar 11, 2024 at 22:50, Xuanwo <[email protected]> wrote:
>> 
>>> Great proposal.
>>>
>>> My only question is, can we avoid the persistence of metadata?
>>>
>>> I'm thinking of two things:
>>>
>>> - I expect virtio to be stateless and easy to recover and deploy, users
>>> don't need to maintain extra stateful services.
>>> - External changes to storage services such as S3 and GCS can create
>>> additional synchronization work.
>>>
>>> On Mon, Mar 11, 2024, at 22:44, 余润杰 wrote:
>>>> Greetings, everyone!
>>>>
>>>> I'm Runjie Yu, a student at Huazhong University of Science and
>>>> Technology. I would like to participate in the OpenDAL GSoC project as a
>>>> candidate, and I've already prepared a proposal draft. I plan to refine
>>>> this draft further and would appreciate receiving suggestions before
>>>> proceeding. I also hope to verify whether my design makes sense and meets
>>>> expectations.
>>>>
>>>> This project aims to implement shared directories for virtual machines
>>>> based on OpenDAL using virtio technology. This is the relevant issue
>>>> link: https://github.com/apache/opendal/issues/4133.
>>>>
>>>> *Attachments:*
>>>>  • zjregee_GSoC_2024_OpenDAL_Project_Proposal_Draft.md
>>> Xuanwo
>>>
>> 
>
>
> Great discussion! I have some prior experience with VirtioFS, so I
> can mentor this proposal together with Xuanwo.
>
> Personally, I have one tough question about your second design draft:
>
>> ovirtiofs needs to retrieve file system state information from the object 
>> storage when restarting
>
> I think this means that we need to depend on the network status for the 
> service. 
>
> For example, we store the metadata in S3 (standard S3 or an S3-like
> service), and we have Swift, HDFS and many other backend services.
> I think we need to keep a minimal set of functions working if S3
> has crashed or we hit network issues.
>
> This is just my personal thought. Feel free to ask if you have any questions.
>
> Best
>
> Manjusaka

-- 
Xuanwo
