Re: 0.3 and the OOM gremlin (CASSANDRA-208)

Jonathan Ellis Thu, 04 Jun 2009 09:04:04 -0700

A client-side upgrade script would be trivial.  (Note that PostgreSQL
has gotten by with basically this approach for many years now.)


for key in old_client.get_key_range(everything):
  # use the *_super variants for super column families
  columns = old_client.get_slice(key, all columns)   # or get_slice_super
  new_client.batch_insert(key, columns)              # or batch_insert_super

Obviously I'm glossing over a few things here but anyone competent
could bang this out in a day or two.
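
As a rough illustration of the loop above, here is a minimal Python sketch.
It is not a real migration tool: InMemoryClient is a hypothetical stand-in
for the old and new clients, and while the method names come from the
pseudocode, their signatures (column family, key, plain dicts for columns)
are assumptions made for the example and do not match any actual release.

```python
# Sketch only. InMemoryClient is a hypothetical stand-in for the old and new
# Cassandra clients; the method signatures below are assumptions.


class InMemoryClient:
    """Hypothetical client that keeps rows in nested dicts."""

    def __init__(self, super_families=()):
        # Names of column families that hold super columns.
        self.super_families = set(super_families)
        # Layout: {column_family: {key: columns}}
        self.data = {}

    def get_key_range(self, column_family):
        # Return every key in the column family, in sorted order.
        return sorted(self.data.get(column_family, {}))

    def get_slice(self, column_family, key):
        # Standard columns: {column_name: value}
        return dict(self.data[column_family][key])

    def get_slice_super(self, column_family, key):
        # Super columns: {super_column_name: {column_name: value}}
        row = self.data[column_family][key]
        return {sc: dict(cols) for sc, cols in row.items()}

    def batch_insert(self, column_family, key, columns):
        self.data.setdefault(column_family, {})[key] = dict(columns)

    def batch_insert_super(self, column_family, key, super_columns):
        self.data.setdefault(column_family, {})[key] = {
            sc: dict(cols) for sc, cols in super_columns.items()
        }


def migrate(old_client, new_client, column_families):
    """Copy every row of each listed column family; return rows copied."""
    copied = 0
    for cf in column_families:
        is_super = cf in old_client.super_families
        for key in old_client.get_key_range(cf):
            if is_super:
                columns = old_client.get_slice_super(cf, key)
                new_client.batch_insert_super(cf, key, columns)
            else:
                columns = old_client.get_slice(cf, key)
                new_client.batch_insert(cf, key, columns)
            copied += 1
    return copied
```

The things glossed over stay glossed over here as well: paging through
large key ranges, retries, and writes that arrive while the copy is running.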

Binary upgrading is substantially more complicated.

I'm happy to leave either or both of these approaches as a "scratch
your itch" for the first organization who wants it badly enough. :)

-Jonathan

On Thu, Jun 4, 2009 at 10:55 AM, Johan Oskarsson <[email protected]> wrote:
> +1 for getting an Apache release out there as soon as possible to show
> that the project is alive.
>
> If we can resolve the following in some way I think it's ok to push this
> issue to 0.4.0:
>
> * We should make sure that end users are aware of this bug, in a known
> issues file or the readme for example, with a link to the jira ticket
> and a description of how it happens and how to avoid it.
> * Write up how the versions are compatible with each other. As mentioned
> on IRC, the 0.3.0 and 0.4.0 data files would not be compatible.
> * Work out roughly how common this problem will be. If all new users
> will hit it, the release won't really be of much use.
> * Since the data files will be incompatible between versions, do we plan
> on bundling an upgrade tool? If not now, when? After 1.0?
>
> /Johan
>
> Jonathan Ellis wrote:
>> So, in light of Sandeep's point, I think I would prefer to do 0.3 now,
>> and try to do a short 0.4 cycle with current trunk and
>>
>>  - the sstable redesign to address the OOM problem
>>  - multitable support
>>  - patch to reduce logging impact so we look better in benchmarks :)
>>  - fsync fix
>>  - remove old get_slice and rename get_slice_from to get_slice
>>
>> How does that sound?
>>
>> -Jonathan
>>
>> On Wed, Jun 3, 2009 at 4:59 PM, Jonathan Ellis <[email protected]> wrote:
>>> You are right.  Of course, there's no sense in making such a tool
>>> harder to write than it needs to be.
>>>
>>> But I don't care that strongly since I won't be writing it. :P
>>>
>>> -Jonathan
>>>
>>> On Wed, Jun 3, 2009 at 4:53 PM, Sandeep Tata <[email protected]> wrote:
>>>> Won't things like multi-table support break binary compatibility anyway?
>>>>
>>>> We might be stuck with having to write a tool that migrates from a 0.3
>>>> format to a 0.4 format.
>>>>
>>>>
>>>> On Wed, Jun 3, 2009 at 2:44 PM, Jonathan Ellis <[email protected]> wrote:
>>>>> The fix for 208 [1] is fairly invasive.  Should we
>>>>>
>>>>> (a) release another RC and do more testing before 0.3 final, or
>>>>> (b) release 0.3 without these changes, and only add this fix to trunk?
>>>>>
>>>>> Although I see the 0.3 release primarily as a means to let people
>>>>> start playing with the cassandra data model, I don't know that I want
>>>>> to release it knowing that 0.4 is going to be binary-incompatible with
>>>>> the 0.3 data files.  So I'd be inclined towards (a).
>>>>>
>>>>> [1] https://issues.apache.org/jira/browse/CASSANDRA-208
>>>>>
>>>>> -Jonathan
>>>>>
>
>
