At a high level, I really like the idea of being able to better leverage
cheaper storage, especially object stores like S3.
One important thing though: I feel pretty strongly that there's a big,
deal-breaking downside. Backups, disk failure policies, snapshots and
possibly repairs would get more
Is there still interest in this? Can we get some points down on electrons so
that we all understand the issues?
While it is fairly simple to redirect the read/write to something other than
the local system for a single node, this will not solve the problem for tiered
storage.
Tiered storage
@henrik, Have you made any progress on this? I would like to help drive
it forward but I am waiting to see what your code looks like and figure out
what I need to do. Any update on timeline would be appreciated.
On Mon, Oct 23, 2023 at 9:07 PM Jon Haddad wrote:
> I think this is a great more
I think this is a great more generally useful than the two scenarios you've
outlined. I think it could / should be possible to use an object store as the
primary storage for sstables and rely on local disk as a cache for reads.
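To make the "object store as primary storage, local disk as a read cache" idea
concrete, here is a minimal read-through cache sketch in Java. Everything in it
is an assumption for illustration only: ObjectStore, InMemoryStore and
ReadThroughCache are hypothetical names, not Cassandra or S3 APIs, and
eviction, concurrency and cleanup after a failed fetch are left out.

```java
import java.io.ByteArrayInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.HashMap;
import java.util.Map;

// Hypothetical abstraction over a remote object store (not a real S3 API).
interface ObjectStore {
    InputStream get(String key) throws IOException;
}

// In-memory stand-in so the sketch runs without any cloud account.
class InMemoryStore implements ObjectStore {
    private final Map<String, byte[]> objects = new HashMap<>();
    int gets = 0; // counts remote fetches, for demonstration only

    void put(String key, byte[] data) {
        objects.put(key, data.clone());
    }

    @Override
    public InputStream get(String key) throws IOException {
        byte[] data = objects.get(key);
        if (data == null) throw new FileNotFoundException(key);
        gets++;
        return new ByteArrayInputStream(data);
    }
}

// Serves reads from a local directory, fetching from the store on a miss.
class ReadThroughCache {
    private final ObjectStore store;
    private final Path cacheDir;

    ReadThroughCache(ObjectStore store, Path cacheDir) {
        this.store = store;
        this.cacheDir = cacheDir;
    }

    Path fetch(String key) throws IOException {
        Path local = cacheDir.resolve(key);
        if (Files.exists(local)) return local; // cache hit: no remote call
        Files.createDirectories(local.getParent());
        // Download to a temp file first so readers never see a partial file.
        Path tmp = Files.createTempFile(local.getParent(), "fetch", ".tmp");
        try (InputStream in = store.get(key)) {
            Files.copy(in, tmp, StandardCopyOption.REPLACE_EXISTING);
        }
        return Files.move(tmp, local, StandardCopyOption.ATOMIC_MOVE);
    }
}

class CacheDemo {
    public static void main(String[] args) throws IOException {
        InMemoryStore store = new InMemoryStore();
        store.put("example-Data.db", "rows".getBytes(StandardCharsets.UTF_8));
        ReadThroughCache cache =
            new ReadThroughCache(store, Files.createTempDirectory("cache"));
        cache.fetch("example-Data.db"); // miss: fetched from the store
        cache.fetch("example-Data.db"); // hit: served from local disk
        System.out.println("remote fetches: " + store.gets);
    }
}
```

The second fetch never touches the store, which is the property a local read
cache in front of an object store would need to provide.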
I don't know the roadmap for TCM, but imo if it allowed for more
ay.
>>>>>
>>>>> Henrik, How does your system work? What is the design strategy?
>>>>> Also is your code available somewhere?
>>>>>
>>>>> After looking at the code some more I think that the best solution is
>>>> not a FileChannelProxy but to modify the Cassandra File class to get a
>>>> FileSystem object for a Factory
that this makes it a very small change that will pick up
>>> 90+% of the cases. We then just need to find the edge cases.
>>>
>> On Fri, Sep 29, 2023 at 1:14 AM German Eichberger via dev <
>> dev@cassandra.apache.org> wrote:
>>
>>> Super excited about this as well. Happy to help test with Azure and any
>>> other way needed.
>>>
>>> Thanks,
>>> German
>> --
>> *From:* guo Maxwell
>> *Sent:* Wednesday, September 27, 2023 7:38 PM
>> *To:* dev@cassandra.apache.org
>> *Subject:* [EXTERNAL] Re: [DISCUSS] CEP-36: A Configurable ChannelProxy
>> to alias external storage locations
>>
> Thanks. So I think a Jira can be created now. And I'd be happy to provide
> some help with this as well if needed.
>
> Henrik Ingo wrote on Thu, Sep 28, 2023 at 00:21
ChannelProxy to alias
external storage locations
Thanks. So I think a Jira can be created now. And I'd be happy to provide some
help with this as well if needed.
Henrik Ingo <henrik.i...@datastax.com> wrote on Thu, Sep 28, 2023 at 00:21:
It seems I was volunteered to rebase the Astra implemen
From: Jake Luciani <jak...@gmail.com>
Sent: Tuesday, September 26, 2023 19:03
To: dev@cassandra.apache.org
Subject: Re: [DISCUSS] CEP-36: A Configurable ChannelProxy to alias external storage locations
Thanks. So I think a Jira can be created now. And I'd be happy to provide
some help with this as well if needed.
Henrik Ingo wrote on Thu, Sep 28, 2023 at 00:21:
It seems I was volunteered to rebase the Astra implementation of this
functionality (FileSystemProvider) onto Cassandra trunk. (And publish it,
of course) I'll try to get going today or tomorrow, so that this
discussion can then benefit from having that code available for inspection.
And
ge, we might "compact" "TWCS tables"
> automatically after so-and-so period by moving them there.
>
From: Jake Luciani
Sent: Tuesday, September 26, 2023 19:03
To: dev@cassandra.apache.org
Subject: Re: [DISCUSS] CEP-36: A Configurable ChannelProxy to alias external
storage locations
We (DataStax) have a FileSystemProvider for Astra we can provide.
Works with S3/GCS/Azure.
I'll ask someone on our end to make it accessible.
This would work by having a bucket prefix per node. But there are lots
of details needed to support things like out-of-band compaction
(mentioned in
I agree with Ariel; the more suitable insertion point is probably the JDK-level
FileSystemProvider and FileSystem abstraction.
It might also be that we can reuse existing work here in some cases?
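For readers less familiar with the JDK-level abstraction mentioned here, a
small sketch follows. It only uses standard java.nio.file calls against the
default file system; the class and method names (FileSystemSketch, writeFile,
readFile) and the file name are made up for illustration. The point is that
FileChannel.open(Path, ...) is dispatched to the provider that owns the Path,
so the same code would in principle run against any installed
FileSystemProvider.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.FileSystem;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class FileSystemSketch {
    // Writes through whichever FileSystemProvider owns 'file'.
    static void writeFile(Path file, byte[] data) throws IOException {
        try (FileChannel ch = FileChannel.open(file,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.TRUNCATE_EXISTING)) {
            ByteBuffer buf = ByteBuffer.wrap(data);
            while (buf.hasRemaining()) {
                ch.write(buf);
            }
        }
    }

    // Reads the whole file back through the same provider.
    static byte[] readFile(Path file) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            ByteBuffer buf = ByteBuffer.allocate((int) ch.size());
            while (buf.hasRemaining() && ch.read(buf) >= 0) {
                // keep reading until the buffer is full or EOF is reached
            }
            return buf.array();
        }
    }

    public static void main(String[] args) throws IOException {
        // The default file system here; a remote-storage provider would be
        // obtained from FileSystems as well, under its own URI scheme.
        FileSystem fs = FileSystems.getDefault();
        Path dir = Files.createTempDirectory("sstables");
        Path file = fs.getPath(dir.toString(), "example-Data.db");
        writeFile(file, "hello".getBytes(StandardCharsets.UTF_8));
        System.out.println(new String(readFile(file), StandardCharsets.UTF_8));
    }
}
```

Nothing in writeFile or readFile names a concrete storage backend, which is
the appeal of integrating at this level rather than at the channel level.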
> On 26 Sep 2023, at 17:49, Ariel Weisberg wrote:
Hi,
Support for multiple storage backends including remote storage backends is a
pretty high value piece of functionality. I am happy to see there is interest
in that.
I think that `ChannelProxyFactory` as an integration point is going to quickly
turn into a dead end as we get into really
Yeah, there are so many things to do, as Cassandra (shared-nothing) is
different from some other systems like HBase. So I think we can break the
final goal into multiple steps. The first is what Claude proposed. But I
suggest that this design make the interface more scalable, and we can
consider the
> it may be better to support most cloud storage
> It simply only supports S3, which feels a bit customized for a certain user
> and is not universal enough. Am I right?
I agree w/the eventual goal (and constraint on design now) of supporting most
popular cloud storage vendors, but if we have
The intention of the CEP is to lay the groundwork to allow development of
ChannelProxyFactories that are pluggable in Cassandra. In this way any
storage system can be a candidate for Cassandra storage provided
FileChannels can be created for the system.
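As a rough sketch of what a pluggable channel factory could look like, the
Java below redirects ("aliases") every path under one directory to another
directory before opening a FileChannel. The ChannelFactory interface and the
AliasingChannelFactory class are hypothetical and are not the CEP's or
Cassandra's actual ChannelProxyFactory API; the sketch only assumes that the
target storage can produce a FileChannel for a Path.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.OpenOption;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical plug-in point: anything that can hand back a FileChannel.
interface ChannelFactory {
    FileChannel open(Path path, OpenOption... options) throws IOException;
}

// Maps paths under 'from' to the same relative location under 'to'.
class AliasingChannelFactory implements ChannelFactory {
    private final Path from;
    private final Path to;

    AliasingChannelFactory(Path from, Path to) {
        this.from = from.toAbsolutePath().normalize();
        this.to = to.toAbsolutePath().normalize();
    }

    // Paths outside 'from' are left untouched.
    Path resolve(Path path) {
        Path abs = path.toAbsolutePath().normalize();
        return abs.startsWith(from) ? to.resolve(from.relativize(abs)) : abs;
    }

    @Override
    public FileChannel open(Path path, OpenOption... options) throws IOException {
        Path target = resolve(path);
        if (target.getParent() != null) {
            Files.createDirectories(target.getParent());
        }
        // Dispatched to whichever provider owns 'target'.
        return FileChannel.open(target, options);
    }
}

class AliasDemo {
    public static void main(String[] args) throws IOException {
        Path data = Files.createTempDirectory("data");
        Path external = Files.createTempDirectory("external");
        ChannelFactory factory = new AliasingChannelFactory(data, external);
        Path logical = data.resolve("ks").resolve("tbl").resolve("a-Data.db");
        try (FileChannel ch = factory.open(logical,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE)) {
            ch.write(ByteBuffer.wrap("x".getBytes(StandardCharsets.UTF_8)));
        }
        System.out.println("written under external dir: "
            + Files.exists(external.resolve("ks/tbl/a-Data.db")));
    }
}
```

Because 'to' can be a Path from any file system provider, the same factory
shape would cover remote storage wherever a provider for it exists.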
As I stated before I think that there may
In my mind, it may be better to support most cloud storage: AWS,
Azure, GCP, Aliyun and so on. We could make it pluggable. But in that case, it
seems we may need a filesystem interface layer for object storage. And
should we support distributed systems like HDFS, or something else? We
should
My intention is to develop an S3 storage system using
https://github.com/carlspring/s3fs-nio
There are several issues yet to be solved:
1. There are some internal calls that create files in the table
directory that do not use the channel proxy. I believe that these are
making calls on
"Rather than building this piece by piece, I think it'd be awesome if
someone drew up an end-to-end plan to implement tiered storage, so we can
make sure we're discussing the whole final state, and not an implementation
detail of one part of the final state?"
I do agree with Jeff on this. If
- I think this is a great step forward.
- Being able to move sstables around between tiers of storage is a feature
Cassandra desperately needs, especially if one of those tiers is some sort
of object storage
- This looks like it's a foundational piece that enables that. Perhaps by a
team that's
external storage can be any storage that you can produce a FileChannel
for. There is an S3 library that does this so S3 is a definite
possibility for storage in this solution. My example code only writes to a
different directory on the same system. And there are a couple of places
where I did
Great suggestion. Can external storage only be local storage media? Or can
the data be stored in any storage medium, such as S3 object storage?
We have previously implemented a tiered storage capability, that is, there
are multiple storage media on one node, SSD, HDD, and data placement based
on
I have just filed CEP-36 [1] to allow for keyspace/table storage outside of
the standard storage space.
There are two desires driving this change:
1. The ability to temporarily move some keyspaces/tables to storage
outside the normal directory tree to other disk so that compaction can