[ 
https://issues.apache.org/jira/browse/AVRO-3890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17778611#comment-17778611
 ] 

Oscar Westra van Holthe - Kind commented on AVRO-3890:
------------------------------------------------------

Personally, I'm not certain about this feature. One question that comes to mind 
is this:

*What is the logical distinction between a main and non-main schema?*

 

To me, the {{isImported()}} distinction looks incidental (as does the 
use of a protocol) rather than material. The reason is that your example shows 
a main schema ({{purchase.Invoice}}) with two additional schemata 
({{customer.Customer}} and {{product.Item}}), neither of which is a main 
schema because each is referenced by another schema.

 

If my reading is correct, you'll welcome the next major release, which adds 
syntax like this:
{code:java}
namespace purchase;
// This is the "main" schema of this file.
schema Invoice;

record Invoice{
    timestamp_ms purchaseTime;
    customer.Customer customer;
    array<product.Item> items = [];
}

record customer.Customer {
    // Field definitions
}

record product.Item {
    // Field definitions
}{code}
More generally, you can determine the main schemata from a list of schemata 
(including referenced schemata not in the list) like this:
{code:java}
public List<Schema> findMainSchemata(List<Schema> schemas) {
    Map<Schema, LinkedHashSet<Schema>> refereesPerSchema = listRefereesPerSchema(schemas);
    return findMainSchemata(refereesPerSchema);
}

public Map<Schema, LinkedHashSet<Schema>> listRefereesPerSchema(List<Schema> schemas) {
    // For each schema: the schemata from which it can be reached
    Map<Schema, LinkedHashSet<Schema>> refereesPerSchema = new LinkedHashMap<>();
    Set<Schema> examinedSchemata = new HashSet<>();
    Deque<Schema> toCheck = new ArrayDeque<>();
    // True until the visitor has seen the starting schema of the current traversal
    boolean[] atRoot = {true};
    SchemaVisitor<Void> findUsersOfSchemata = new SchemaVisitor<Void>() {
        private SchemaVisitorAction visitSchema(Schema schema) {
            LinkedHashSet<Schema> referees =
                    refereesPerSchema.computeIfAbsent(schema, ignored -> new LinkedHashSet<>());
            if (atRoot[0]) {
                // A traversal visits its starting schema first; that visit is not a reference
                atRoot[0] = false;
            } else {
                referees.add(toCheck.element());
            }
            if (examinedSchemata.add(schema)) {
                // We haven't seen this schema yet
                toCheck.add(schema);
            }
            return SchemaVisitorAction.CONTINUE;
        }

        @Override
        public SchemaVisitorAction visitTerminal(Schema schema) {
            return visitSchema(schema);
        }

        @Override
        public SchemaVisitorAction visitNonTerminal(Schema schema) {
            return visitSchema(schema);
        }

        @Override
        public SchemaVisitorAction afterVisitNonTerminal(Schema schema) {
            return SchemaVisitorAction.CONTINUE;
        }

        @Override
        public Void get() {
            return null;
        }
    };

    examinedSchemata.addAll(schemas);
    toCheck.addAll(schemas);
    while (!toCheck.isEmpty()) {
        atRoot[0] = true;
        Schemas.visit(toCheck.element(), findUsersOfSchemata);
        toCheck.pop();
    }
    return refereesPerSchema;
}

public List<Schema> findMainSchemata(Map<Schema, LinkedHashSet<Schema>> refereesPerSchema) {
    List<Schema> mainSchemata = new ArrayList<>();
    Set<Schema> schemasInCycles = new HashSet<>();
    for (Map.Entry<Schema, LinkedHashSet<Schema>> entry : refereesPerSchema.entrySet()) {
        Schema schema = entry.getKey();
        LinkedHashSet<Schema> referees = entry.getValue();
        if (referees.isEmpty()) {
            // The schema is never referenced
            mainSchemata.add(schema);
        } else if (referees.contains(schema) && !schemasInCycles.contains(schema)) {
            // The schema (also) references itself: it is part of a cycle.
            // But it is not part of a known cycle, so we accept it as a main schema.
            mainSchemata.add(schema);
            schemasInCycles.addAll(referees);
        }
        // If no branch matched, the schema is referenced somewhere and is not the
        // main schema in a loop.
    }
    return mainSchemata;
}
{code}

Note that I haven't tested this code yet.
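
To illustrate the same idea without any Avro dependency, here is a minimal sketch that works on plain names instead of {{Schema}} objects. The class {{MainSchemaSketch}}, its methods and the map of references are hypothetical and exist only for this example; the sketch assumes every schema of interest is a key in the map, and it treats names that appear only as values as leaves. A name counts as "main" if nothing outside its own cycle can reach it, and only the first member of such a cycle is reported.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper for illustration only; it does not use any Avro API.
final class MainSchemaSketch {
    /** Returns every name reachable from start by following one or more references. */
    static Set<String> reachableFrom(String start, Map<String, List<String>> references) {
        Set<String> reached = new LinkedHashSet<>();
        Deque<String> toVisit = new ArrayDeque<>(references.getOrDefault(start, List.of()));
        while (!toVisit.isEmpty()) {
            String name = toVisit.pop();
            if (reached.add(name)) {
                toVisit.addAll(references.getOrDefault(name, List.of()));
            }
        }
        return reached;
    }

    /** Returns the keys that are only reachable from members of their own cycle (if any). */
    static List<String> findMain(Map<String, List<String>> references) {
        Map<String, Set<String>> reachable = new LinkedHashMap<>();
        for (String name : references.keySet()) {
            reachable.put(name, reachableFrom(name, references));
        }
        List<String> main = new ArrayList<>();
        Set<String> coveredByCycle = new HashSet<>();
        for (String candidate : references.keySet()) {
            if (coveredByCycle.contains(candidate)) {
                continue; // an earlier main schema already represents this cycle
            }
            boolean isMain = true;
            for (Map.Entry<String, Set<String>> entry : reachable.entrySet()) {
                String other = entry.getKey();
                boolean reachesCandidate = !other.equals(candidate) && entry.getValue().contains(candidate);
                // A referrer outside the candidate's own cycle disqualifies the candidate
                if (reachesCandidate && !reachable.get(candidate).contains(other)) {
                    isMain = false;
                    break;
                }
            }
            if (isMain) {
                main.add(candidate);
                // Mark all members of the candidate's cycle as covered
                for (String other : reachable.get(candidate)) {
                    Set<String> back = reachable.get(other);
                    if (back != null && back.contains(candidate)) {
                        coveredByCycle.add(other);
                    }
                }
            }
        }
        return main;
    }

    public static void main(String[] args) {
        Map<String, List<String>> references = new LinkedHashMap<>();
        references.put("purchase.Invoice", List.of("customer.Customer", "product.Item"));
        references.put("customer.Customer", List.of());
        references.put("product.Item", List.of());
        System.out.println(findMain(references)); // prints [purchase.Invoice]
    }
}
```

Unlike the snippet above, this sketch also rejects a cycle that is referenced from outside: with {{Root -> A}}, {{A -> B}} and {{B -> A}}, only {{Root}} is reported.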

> Add feature to know if schema is imported in avdl
> -------------------------------------------------
>
>                 Key: AVRO-3890
>                 URL: https://issues.apache.org/jira/browse/AVRO-3890
>             Project: Apache Avro
>          Issue Type: New Feature
>          Components: java
>    Affects Versions: 1.11.2, 1.11.3
>            Reporter: Thun Hak
>            Priority: Minor
>
> Consider this invoice.avdl:
> {code:java}
> @namespace("purchase")
> protocol InvoiceProtocol{
>    import idl "../customer/data.avdl";  
>    import idl "../product/data.avdl";    
>    record Invoice{
>       timestamp_ms purchaseTime;
>       customer.Customer customer;
>       array<product.Item> items = [];
>    }
> }{code}
> Now, if I do this:
> {code:java}
> for(Schema s : protocol.getTypes()){
>    System.out.println(s.getName() + " " + s.isImported());
> }{code}
> this should produce:
> {code:java}
> Customer true
> Item true
> Invoice false{code}
> I have a use case where we want to operate only on the "main" 
> schema and leave all imported schemas alone.
> This feature (s.isImported()) would be really nice to have (if it's not 
> already available).
> Thanks



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
