Re: [DISCUSS] CEP-36: A Configurable ChannelProxy to alias external storage locations

2023-12-15 Thread Jon Haddad
At a high level I really like the idea of being able to better leverage cheaper storage especially object stores like S3. One important thing though - I feel pretty strongly that there's a big, deal breaking downside. Backups, disk failure policies, snapshots and possibly repairs would get more

Re: [DISCUSS] CEP-36: A Configurable ChannelProxy to alias external storage locations

2023-12-14 Thread Claude Warren
Is there still interest in this? Can we get some points down on electrons so that we all understand the issues? While it is fairly simple to redirect reads and writes to something other than the local system for a single node, this will not solve the problem for tiered storage. Tiered storage

Re: [DISCUSS] CEP-36: A Configurable ChannelProxy to alias external storage locations

2023-10-31 Thread Claude Warren, Jr via dev
@henrik, have you made any progress on this? I would like to help drive it forward, but I am waiting to see what your code looks like and figure out what I need to do. Any update on the timeline would be appreciated. On Mon, Oct 23, 2023 at 9:07 PM Jon Haddad wrote: > I think this is a great more

Re: [DISCUSS] CEP-36: A Configurable ChannelProxy to alias external storage locations

2023-10-23 Thread Jon Haddad
I think this is a great more generally useful than the two scenarios you've outlined. I think it could / should be possible to use an object store as the primary storage for sstables and rely on local disk as a cache for reads. I don't know the roadmap for TCM, but imo if it allowed for more
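A minimal sketch of the "object store as primary storage, local disk as read cache" idea, under stated assumptions: `ReadThroughCache`, `localCopy`, `readText` and `withTempDirs` are hypothetical names, and a second local directory stands in for the object store so the example runs without any cloud SDK. Because sstables are immutable once written, the cache never has to invalidate an entry; it only fetches on a miss.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical read-through cache; not part of Cassandra or of CEP-36.
class ReadThroughCache {
    private final Path remoteRoot; // stand-in for the object store
    private final Path cacheDir;   // local disk used as a read cache
    int remoteFetches = 0;         // counts cache misses, for illustration

    ReadThroughCache(Path remoteRoot, Path cacheDir) {
        this.remoteRoot = remoteRoot;
        this.cacheDir = cacheDir;
    }

    // Return a local path for the file, copying it from "remote" on first use.
    Path localCopy(String name) {
        Path cached = cacheDir.resolve(name);
        if (!Files.exists(cached)) {
            try {
                Files.copy(remoteRoot.resolve(name), cached);
                remoteFetches++;
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
        return cached;
    }

    String readText(String name) {
        try {
            return new String(Files.readAllBytes(localCopy(name)), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Test helper: builds a cache over two temp directories with one "remote" file.
    static ReadThroughCache withTempDirs(String name, String content) {
        try {
            Path remote = Files.createTempDirectory("remote");
            Path cache = Files.createTempDirectory("cache");
            Files.write(remote.resolve(name), content.getBytes(StandardCharsets.UTF_8));
            return new ReadThroughCache(remote, cache);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A real implementation would also need eviction and a policy for where writes land first, neither of which this sketch attempts.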

Re: [DISCUSS] CEP-36: A Configurable ChannelProxy to alias external storage locations

2023-10-19 Thread Claude Warren, Jr via dev
ay. >>>>> >>>>> Henrik, How does your system work? What is the design strategy? >>>>> Also is your code available somewhere? >>>>> >>>>> After looking at the code some more I think that the best solution is >>>>

Re: [DISCUSS] CEP-36: A Configurable ChannelProxy to alias external storage locations

2023-10-18 Thread guo Maxwell
Also >>>> is your code available somewhere? >>>> >>>> After looking at the code some more I think that the best solution is >>>> not a FileChannelProxy but to modify the Cassandra File class to get a >>>> FileSystem object for a Factory

Re: [DISCUSS] CEP-36: A Configurable ChannelProxy to alias external storage locations

2023-10-18 Thread Claude Warren, Jr via dev
that this makes it a very small change that will pick up >>> 90+% of the cases. We then just need to find the edge cases. >>> >>> >>> >>> >>> >>> On Fri, Sep 29, 2023 at 1:14 AM German Eichberger via dev < >>> dev@cassand

Re: [DISCUSS] CEP-36: A Configurable ChannelProxy to alias external storage locations

2023-10-18 Thread Claude Warren, Jr via dev
> On Fri, Sep 29, 2023 at 1:14 AM German Eichberger via dev < >> dev@cassandra.apache.org> wrote: >> >>> Super excited about this as well. Happy to help test with Azure and any >>> other way needed. >>> >>> Thanks, >>> German >>

Re: [DISCUSS] CEP-36: A Configurable ChannelProxy to alias external storage locations

2023-10-10 Thread Claude Warren, Jr via dev
man >> -- >> *From:* guo Maxwell >> *Sent:* Wednesday, September 27, 2023 7:38 PM >> *To:* dev@cassandra.apache.org >> *Subject:* [EXTERNAL] Re: [DISCUSS] CEP-36: A Configurable ChannelProxy >> to alias external storage locations >> >

Re: [DISCUSS] CEP-36: A Configurable ChannelProxy to alias external storage locations

2023-09-29 Thread Claude Warren, Jr via dev
> *Subject:* [EXTERNAL] Re: [DISCUSS] CEP-36: A Configurable ChannelProxy > to alias external storage locations > > Thanks. So I think a Jira can be created now. And I'd be happy to provide > some help with this as well if needed. > > Henrik Ingo, on Thu, Sep 28, 2023 at 00:21

Re: [DISCUSS] CEP-36: A Configurable ChannelProxy to alias external storage locations

2023-09-28 Thread German Eichberger via dev
ChannelProxy to alias external storage locations Thanks. So I think a Jira can be created now. And I'd be happy to provide some help with this as well if needed. Henrik Ingo <henrik.i...@datastax.com> wrote on Thu, Sep 28, 2023 at 00:21: It seems I was volunteered to rebase the Astra implemen

Re: [DISCUSS] CEP-36: A Configurable ChannelProxy to alias external storage locations

2023-09-27 Thread Jeff Jirsa
From: Jake Luciani <jak...@gmail.com> Sent: Tuesday, September 26, 2023 19:03 To: dev@cassandra.apache.org Subject: Re: [DISCUSS] CEP-36: A Configurable ChannelProxy to alias external storage locations

Re: [DISCUSS] CEP-36: A Configurable ChannelProxy to alias external storage locations

2023-09-27 Thread guo Maxwell
Thanks. So I think a Jira can be created now. And I'd be happy to provide some help with this as well if needed. Henrik Ingo wrote on Thu, Sep 28, 2023 at 00:21: > It seems I was volunteered to rebase the Astra implementation of this > functionality (FileSystemProvider) onto Cassandra trunk. (And publish

Re: [DISCUSS] CEP-36: A Configurable ChannelProxy to alias external storage locations

2023-09-27 Thread Henrik Ingo
It seems I was volunteered to rebase the Astra implementation of this functionality (FileSystemProvider) onto Cassandra trunk. (And publish it, of course) I'll try to get going today or tomorrow, so that this discussion can then benefit from having that code available for inspection. And

Re: [DISCUSS] CEP-36: A Configurable ChannelProxy to alias external storage locations

2023-09-27 Thread Claude Warren, Jr via dev
ge, we might "compact" "TWCS tables" > automatically after so-and-so period by moving them there. > > > From: Jake Luciani > Sent: Tuesday, September 26, 2023 19:03 > To: dev@cassandra.apache.org > Subject: Re: [DISCUSS]

Re: [DISCUSS] CEP-36: A Configurable ChannelProxy to alias external storage locations

2023-09-26 Thread Miklosovic, Stefan
From: Jake Luciani Sent: Tuesday, September 26, 2023 19:03 To: dev@cassandra.apache.org Subject: Re: [DISCUSS] CEP-36: A Configurable ChannelProxy to alias external storage locations

Re: [DISCUSS] CEP-36: A Configurable ChannelProxy to alias external storage locations

2023-09-26 Thread Jake Luciani
We (DataStax) have a FileSystemProvider for Astra we can provide. It works with S3/GCS/Azure. I'll ask someone on our end to make it accessible. This would work by having a bucket prefix per node. But there are lots of details needed to support things like out-of-band compaction (mentioned in
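To make "a bucket prefix per node" concrete, here is a small sketch of one possible key layout. Everything in it is an assumption for illustration: the class `NodeKeyLayout`, the `nodes/<hostId>/` prefix and the keyspace/table/file ordering are hypothetical, since the Astra implementation is not shown in this thread.

```java
import java.util.UUID;

// Hypothetical object-key layout; the thread only says "a bucket prefix per node".
class NodeKeyLayout {
    private final String prefix;

    NodeKeyLayout(UUID hostId) {
        // Every object written by this node lives under its own prefix.
        this.prefix = "nodes/" + hostId + "/";
    }

    // Map an sstable component to an object key inside the shared bucket.
    String keyFor(String keyspace, String table, String fileName) {
        return prefix + keyspace + "/" + table + "/" + fileName;
    }

    // True if the key belongs to this node's slice of the bucket.
    boolean owns(String objectKey) {
        return objectKey.startsWith(prefix);
    }

    public static void main(String[] args) {
        NodeKeyLayout layout = new NodeKeyLayout(UUID.randomUUID());
        System.out.println(layout.keyFor("ks", "tbl", "nb-1-big-Data.db"));
    }
}
```

Keeping each node under a disjoint prefix preserves the share-nothing model: no two nodes ever write the same key, so no cross-node coordination is needed for ordinary flushes.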

Re: [DISCUSS] CEP-36: A Configurable ChannelProxy to alias external storage locations

2023-09-26 Thread Benedict
I agree with Ariel, the more suitable insertion point is probably the JDK level FileSystemProvider and FileSystem abstraction. It might also be that we can reuse existing work here in some cases? > On 26 Sep 2023, at 17:49, Ariel Weisberg wrote: > > Hi, > > Support for multiple storage
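For readers less familiar with the JDK abstraction mentioned here, the sketch below shows why it is an attractive insertion point: `Paths.get(URI)` selects the installed `FileSystemProvider` by URI scheme, and `FileChannel.open(Path, ...)` delegates to that provider, so calling code does not change when the backing store does. The class `FileSystemSketch` and its helpers are hypothetical, and only the built-in `file:` scheme is exercised; a remote provider would need its own jar on the classpath.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URI;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

// Hypothetical helper class; only standard JDK calls are used.
class FileSystemSketch {
    // The URI scheme decides which FileSystemProvider handles the path.
    static Path resolve(URI location) {
        return Paths.get(location);
    }

    static void write(Path file, String text) {
        ByteBuffer buf = ByteBuffer.wrap(text.getBytes(StandardCharsets.UTF_8));
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.CREATE,
                StandardOpenOption.WRITE, StandardOpenOption.TRUNCATE_EXISTING)) {
            while (buf.hasRemaining()) {
                ch.write(buf); // a single write may be partial, so loop
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static String read(Path file) {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            ByteBuffer buf = ByteBuffer.allocate((int) ch.size());
            while (buf.hasRemaining() && ch.read(buf) != -1) {
                // keep reading until the buffer is full or end of file
            }
            return new String(buf.array(), 0, buf.position(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Test helper: a file: URI pointing at a fresh temp file.
    static URI tempLocation() {
        try {
            return Files.createTempFile("sketch", ".db").toUri();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```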

Re: [DISCUSS] CEP-36: A Configurable ChannelProxy to alias external storage locations

2023-09-26 Thread Ariel Weisberg
Hi, Support for multiple storage backends including remote storage backends is a pretty high value piece of functionality. I am happy to see there is interest in that. I think that `ChannelProxyFactory` as an integration point is going to quickly turn into a dead end as we get into really

Re: [DISCUSS] CEP-36: A Configurable ChannelProxy to alias external storage locations

2023-09-26 Thread guo Maxwell
Yeah, there are so many things to do, as Cassandra (share-nothing) is different from some other systems like HBase. So I think we can break the final goal into multiple steps. The first is what Claude proposed. But I suggest that this design make the interface more scalable, and we can consider the

Re: [DISCUSS] CEP-36: A Configurable ChannelProxy to alias external storage locations

2023-09-26 Thread Josh McKenzie
> it may be better to support most cloud storage > It simply only supports S3, which feels a bit customized for a certain user > and is not universal enough. Am I right? I agree w/the eventual goal (and constraint on design now) of supporting most popular cloud storage vendors, but if we have

Re: [DISCUSS] CEP-36: A Configurable ChannelProxy to alias external storage locations

2023-09-26 Thread Claude Warren, Jr via dev
The intention of the CEP is to lay the groundwork to allow development of ChannelProxyFactories that are pluggable in Cassandra. In this way any storage system can be a candidate for Cassandra storage provided FileChannels can be created for the system. As I stated before I think that there may
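The pluggability described here can be sketched as a small interface plus a per-keyspace registry. This is an illustration under assumptions, not the CEP's API: `ChannelSource`, `DirectoryChannelSource` and `ChannelSources` are hypothetical names, and the only backend shown redirects files to a different local directory.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.OpenOption;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.HashMap;
import java.util.Map;

// Hypothetical plug-in point: anything that can hand back a FileChannel qualifies.
interface ChannelSource {
    FileChannel open(String fileName, OpenOption... options) throws IOException;
}

// One backend: files live under an arbitrary root directory.
class DirectoryChannelSource implements ChannelSource {
    private final Path root;

    DirectoryChannelSource(Path root) {
        this.root = root;
    }

    @Override
    public FileChannel open(String fileName, OpenOption... options) throws IOException {
        return FileChannel.open(root.resolve(fileName), options);
    }
}

// Registry that picks a source per keyspace, with a default for everything else.
class ChannelSources {
    private final Map<String, ChannelSource> byKeyspace = new HashMap<>();
    private final ChannelSource fallback;

    ChannelSources(ChannelSource fallback) {
        this.fallback = fallback;
    }

    void register(String keyspace, ChannelSource source) {
        byKeyspace.put(keyspace, source);
    }

    ChannelSource forKeyspace(String keyspace) {
        return byKeyspace.getOrDefault(keyspace, fallback);
    }

    // Helper used in the usage example: write a small text file through a source.
    static void writeText(ChannelSource source, String fileName, String text) {
        ByteBuffer buf = ByteBuffer.wrap(text.getBytes(StandardCharsets.UTF_8));
        try (FileChannel ch = source.open(fileName,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE)) {
            while (buf.hasRemaining()) {
                ch.write(buf);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static Path tempDir() {
        try {
            return Files.createTempDirectory("channels");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Registering a source for one keyspace leaves every other keyspace on the default, which mirrors the CEP's stated goal of moving selected keyspaces or tables outside the standard storage space.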

Re: [DISCUSS] CEP-36: A Configurable ChannelProxy to alias external storage locations

2023-09-26 Thread guo Maxwell
In my mind, it may be better to support most cloud storage: AWS, Azure, GCP, Aliyun and so on. We may make it pluggable. But in that case, it seems a filesystem interface layer for object storage may be needed. And should we support distributed systems like HDFS, or something else? We should

Re: [DISCUSS] CEP-36: A Configurable ChannelProxy to alias external storage locations

2023-09-26 Thread Claude Warren, Jr via dev
My intention is to develop an S3 storage system using https://github.com/carlspring/s3fs-nio There are several issues yet to be solved: 1. There are some internal calls that create files in the table directory that do not use the channel proxy. I believe that these are making calls on

Re: [DISCUSS] CEP-36: A Configurable ChannelProxy to alias external storage locations

2023-09-25 Thread guo Maxwell
"Rather than building this piece by piece, I think it'd be awesome if someone drew up an end-to-end plan to implement tiered storage, so we can make sure we're discussing the whole final state, and not an implementation detail of one part of the final state?" I do agree with Jeff on this. If

Re: [DISCUSS] CEP-36: A Configurable ChannelProxy to alias external storage locations

2023-09-25 Thread Jeff Jirsa
- I think this is a great step forward. - Being able to move sstables around between tiers of storage is a feature Cassandra desperately needs, especially if one of those tiers is some sort of object storage - This looks like it's a foundational piece that enables that. Perhaps by a team that's

Re: [DISCUSS] CEP-36: A Configurable ChannelProxy to alias external storage locations

2023-09-25 Thread Claude Warren, Jr via dev
External storage can be any storage that you can produce a FileChannel for. There is an S3 library that does this, so S3 is a definite possibility for storage in this solution. My example code only writes to a different directory on the same system. And there are a couple of places where I did

Re: [DISCUSS] CEP-36: A Configurable ChannelProxy to alias external storage locations

2023-09-25 Thread guo Maxwell
Great suggestion. Can external storage only be local storage media, or can data be stored in any storage medium, such as S3 object storage? We have previously implemented a tiered storage capability, that is, there are multiple storage media on one node (SSD, HDD), and data placement based on

[DISCUSS] CEP-36: A Configurable ChannelProxy to alias external storage locations

2023-09-25 Thread Claude Warren, Jr via dev
I have just filed CEP-36 [1] to allow for keyspace/table storage outside of the standard storage space. There are two desires driving this change: 1. The ability to temporarily move some keyspaces/tables to storage outside the normal directory tree to other disk so that compaction can