> Do we need any of these things either? We have our own serialisation
> framework and file readers and writers, and at least in the FQL case these
> are the native serialisation format.
>
> At a cursory glance it also looks to me like this would be a minimal refactor
> from the current state.
>
> What is the reason we want to add these other dependencies?
It’s all about the target user of the feature. I can’t speak for audit logging (why would we need more than slf4j? No clue), but one of the users of Chronicle Queue is FQL. We do have an FQL replay command line tool, but we are not trying to make it a powerful tool with a ton of ways to replay at different rates, with interleaving, etc.; it’s a basic “I run whatever I see in the logs”. We have largely moved away from keeping stress tools in-tree, letting them evolve outside of our code base, so following that direction it makes sense for FQL replay to be external to the Apache Cassandra tree… in fact, leveraging existing tools made FQL replay faster and far more powerful than the in-tree version.

With all that, my view is that the target user of FQL data is users/developers, not anything internal to Apache Cassandra… so I then need to ask what the experience of using it is. If we use internal serializers (which are strongly coupled with our internal networking), then the user needs to depend on cassandra-all… this brings in 74 dependencies (see https://mvnrepository.com/artifact/org.apache.cassandra/cassandra-all/5.0.0) and none of these matter to the user, so you must exclude every single one, or you just accept whatever we bring in (which means you are stuck with Java Driver 3 and can’t use Java Driver 4). Once you have that out of the way, you can add the log reading into your tools and do what you want, as long as they are in Java.

So, using our “serialization framework” only seems to come with burdens to me:

1) Its versioning is our internal message versioning, so if we make changes to our networking, FQL is forced to bump its version as well. Likewise, if we need to change the log format, we also need to bump our networking version.
2) For the project, we add even more public classes to the list we need to maintain compatibility with (I have no clue what is public right now; we debate this whenever it comes up), so refactoring our CQL processing layer gets harder.

3) cassandra-all is massive.

4) In order to reuse the logs outside of Java, we need to implement translations to a more common format so other languages can use them. I do have tools in Python to read the Thrift FQL log I write and compute stats on user behavior… it would be nice to read the log file directly and not have to translate it.

> On Sep 19, 2024, at 1:04 PM, Štefan Miklošovič <smikloso...@apache.org> wrote:
>
> More to it, it is actually not only about FQL. Audit logging is on Chronicle
> queues too, so inspecting that would be platform independent as well.
>
> CEP-12 suggests that there might be a persistent store for diagnostic events
> as well. If somebody wants to inspect what a node was doing after it went
> offline — as for now all these events are in memory only.
>
> This would basically enable people to fully inspect what the cluster was
> doing, from FQL to Audit to Diagnostics, in a language-independent manner.
>
> On Thu, Sep 19, 2024 at 9:50 PM Štefan Miklošovič <smikloso...@apache.org
> <mailto:smikloso...@apache.org>> wrote:
>> I think the biggest selling point for using something like protobuf is what
>> David said - what if he wants to replay it in Go? Basing it on something
>> language-neutral enables people to replay it in whatever they want. If we
>> have something totally custom then it is replayable just in Java, without
>> bringing tons of dependencies into their projects. That is the message I got
>> from what he wrote.
>>
>> On Thu, Sep 19, 2024 at 9:47 PM Benedict <bened...@apache.org
>> <mailto:bened...@apache.org>> wrote:
>>> Do we need any of these things either? We have our own serialisation
>>> framework and file readers and writers, and at least in the FQL case these
>>> are the native serialisation format.
>>>
>>> At a cursory glance it also looks to me like this would be a minimal refactor
>>> from the current state.
>>>
>>> What is the reason we want to add these other dependencies?
>>>
>>>
>>>> On 19 Sep 2024, at 20:31, Štefan Miklošovič <smikloso...@apache.org
>>>> <mailto:smikloso...@apache.org>> wrote:
>>>>
>>>>
>>>> Well, the Maven plugin declares that it downloads protoc from Maven Central
>>>> automatically _somehow_, so coding up an Ant task which does something
>>>> similar shouldn't be too hard. I will investigate this idea.
>>>>
>>>> On Thu, Sep 19, 2024 at 9:26 PM Brandon Williams <dri...@gmail.com
>>>> <mailto:dri...@gmail.com>> wrote:
>>>>> On Thu, Sep 19, 2024 at 2:16 PM Štefan Miklošovič
>>>>> <smikloso...@apache.org <mailto:smikloso...@apache.org>> wrote:
>>>>> > Unfortunately there is nothing like that for Ant; protoc would need to
>>>>> > be a local dependency on the computer which compiles the project to be
>>>>> > able to do that, so that is kind of a dead end. Or is there any
>>>>> > workaround here?
>>>>>
>>>>> In the old thrift days I believe we generated the code and checked it
>>>>> in so you didn't need to compile locally.
>>>>>
>>>>> Kind Regards,
>>>>> Brandon
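
For reference, the Ant task Štefan mentions could be sketched roughly like this. Maven Central does publish prebuilt protoc binaries under the com.google.protobuf:protoc coordinates with platform classifiers, but the version number, classifier, property names, and directory layout below are all illustrative assumptions, not the project's actual build configuration:

```xml
<!-- Hypothetical sketch: fetch a prebuilt protoc from Maven Central and run it.
     Version, platform classifier, and property names are assumptions. -->
<property name="protoc.version" value="3.25.1"/>
<property name="protoc.classifier" value="linux-x86_64"/>
<property name="protoc.bin" value="${build.dir}/protoc-${protoc.version}"/>

<target name="get-protoc">
  <!-- skipexisting avoids re-downloading on every build -->
  <get src="https://repo1.maven.org/maven2/com/google/protobuf/protoc/${protoc.version}/protoc-${protoc.version}-${protoc.classifier}.exe"
       dest="${protoc.bin}" skipexisting="true"/>
  <chmod file="${protoc.bin}" perm="ugo+x"/>
</target>

<target name="generate-proto" depends="get-protoc">
  <mkdir dir="${build.dir}/gen-java"/>
  <exec executable="${protoc.bin}" failonerror="true">
    <arg value="--java_out=${build.dir}/gen-java"/>
    <arg value="--proto_path=src/proto"/>
    <arg value="src/proto/fql.proto"/>
  </exec>
</target>
```

The classifier would have to be selected per build platform (linux-x86_64, osx-aarch_64, windows-x86_64, etc.), which is the complication Brandon's suggestion of checking in the generated code sidesteps entirely.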