Hi, we have a case like this:

Data set 1 has a schema with a field { "name": "x", "type": "string" }.
We have app1, which does a generic .get("x") retrieval.
This application is long-lived and we don't want to (maybe can't) change it.

We want to change the name of the field. Let's say our new field name is "y".
According to the docs/spec we are supposed to add that to aliases, so that a
new producer can create data referencing the improved name "y" and an old
consumer can go on thinking in terms of "x" without having to do any work.

The problem is that the world changes, and the name of that field really
should be "y" and not "x". We want to do this because the schema should make
sense in its current context, e.g. we used to call it "horse_drawn_carriage"
and now we want to call it "automobile" (or pda -> mobile_device; lots of
things change context over time). There are lots of real-world examples that I
don't want to (or can't) get into the weeds about; hopefully my two random ones
are enough to illustrate that the problem is real. We also have cases where the
name will likely change again over time, so if we kept using the current
approach and added more aliases, you wouldn't know which of those aliases is
really the current one. That is why we favor the field name being the current
context.

So we do this:

Data set 2 has a schema with a field
{ "name": "y", "type": "string", "aliases": ["x"] }.
We have app2, which does a generic .get("y") retrieval because that is how
folks now know to build their apps. The problem is that aliases are not
bidirectional, so we can't reference "x" to get at our data in the old app,
which breaks :(

So we came up with a patch that handles this ~ roughly ~

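    import org.apache.avro.Schema;
    import org.apache.avro.generic.GenericRecord;

    public static Object resolveField(GenericRecord genericRecord, String fieldName) {
        Schema schema = genericRecord.getSchema();
        // An exact match on a current field name always wins.
        if (schema.getField(fieldName) != null) {
            return genericRecord.get(fieldName);
        }
        // Otherwise treat fieldName as a historic alias of a current field.
        for (Schema.Field field : schema.getFields()) {
            if (field.aliases().contains(fieldName)) {
                return genericRecord.get(field.name());
            }
        }
        return null; // no field or alias by that name
    }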
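
To make the lookup idea concrete without pulling in Avro, here is a minimal, library-free sketch of the same bidirectional resolution. Everything in it is hypothetical and only for illustration: `AliasLookup`, `FieldDef`, `buildIndex` and `resolve` are not Avro APIs, and a plain `Map` stands in for the record. It assumes the name-to-field index can be built once per schema rather than scanning the fields on every call.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class AliasLookup {
    // Hypothetical stand-in for a schema field: current name plus historic aliases.
    static final class FieldDef {
        final String name;
        final List<String> aliases;

        FieldDef(String name, List<String> aliases) {
            this.name = name;
            this.aliases = aliases;
        }
    }

    // Map every current name and every alias to the field's current name.
    static Map<String, String> buildIndex(List<FieldDef> fields) {
        Map<String, String> index = new HashMap<>();
        // Register current names first so an alias can never shadow a real field name.
        for (FieldDef f : fields) {
            index.put(f.name, f.name);
        }
        for (FieldDef f : fields) {
            for (String alias : f.aliases) {
                index.putIfAbsent(alias, f.name);
            }
        }
        return index;
    }

    // Resolve a requested name (current or historic) against a record's values.
    static Object resolve(Map<String, String> index, Map<String, Object> record, String requested) {
        String current = index.get(requested);
        return current == null ? null : record.get(current);
    }

    public static void main(String[] args) {
        List<FieldDef> fields = Arrays.asList(new FieldDef("y", Arrays.asList("x")));
        Map<String, String> index = buildIndex(fields);
        Map<String, Object> record = Collections.<String, Object>singletonMap("y", "value");

        System.out.println(resolve(index, record, "y")); // prints "value"
        System.out.println(resolve(index, record, "x")); // prints "value" (old name still works)
        System.out.println(resolve(index, record, "z")); // prints "null" (unknown name)
    }
}
```

One design choice worth calling out: current names are registered before aliases, so if a historic alias of one field ever collides with the current name of another, the current name wins.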

I wanted to check first whether we were missing something as we went through
this, or whether changing aliases this way is at odds with some principle the
community holds that we are not understanding or properly grokking. I am very
open to the possibility that we have gone down the wrong path here; however,
it does seem to solve the core problem we have with keeping the context of the
schema current. I can see how this problem is not unique to us or our use case
and is one that others have too.

If folks are in sync with this change, I would like to propose/create a patch
to make aliases work bidirectionally, allowing folks to use the name field as
"the current context of the name of the thing", where the list of aliases
holds the historic names.

Thoughts?

Regards,

~ Joe Stein
