[ https://issues.apache.org/jira/browse/AVRO-3890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17778611#comment-17778611 ]
Oscar Westra van Holthe - Kind commented on AVRO-3890:
------------------------------------------------------
Personally, I'm not certain about this feature. One question that comes to mind
is this:
*What is the logical distinction between a main and non-main schema?*
To me, it looks like the {{isImported()}} distinction is incidental (as is the
use of a protocol), and not material. The reason is that your example shows a
main schema ({{purchase.Invoice}}) with two additional schemata
({{customer.Customer}} and {{product.Item}}), neither of which is a main
schema because both are referenced by another schema.
If my opinion is correct, you'll welcome the next major release, which adds
syntax like this:
{code:java}
namespace purchase;
// This is the "main" schema of this file.
schema Invoice;
record Invoice {
  timestamp_ms purchaseTime;
  customer.Customer customer;
  array<product.Item> items = [];
}
record customer.Customer {
  // Field definitions
}
record product.Item {
  // Field definitions
}{code}
More generally, you can determine the main schemata from a list of schemata
(including referenced schemata not in the list) like this:
{code:java}
public List<Schema> findMainSchemata(List<Schema> schemas) {
  Map<Schema, LinkedHashSet<Schema>> refereesPerSchema = listRefereesPerSchema(schemas);
  return findMainSchemata(refereesPerSchema);
}
public Map<Schema, LinkedHashSet<Schema>> listRefereesPerSchema(List<Schema> schemas) {
  Map<Schema, LinkedHashSet<Schema>> refereesPerSchema = new LinkedHashMap<>();
  Set<Schema> examinedSchemata = new HashSet<>();
  Deque<Schema> toCheck = new ArrayDeque<>();
  SchemaVisitor<Void> findUsersOfSchemata = new SchemaVisitor<Void>() {
    private SchemaVisitorAction visitSchema(Schema schema) {
      refereesPerSchema.computeIfAbsent(schema, ignored -> new LinkedHashSet<>()).add(toCheck.element());
      if (examinedSchemata.add(schema)) {
        // We haven't seen this schema yet
        toCheck.add(schema);
      }
      return SchemaVisitorAction.CONTINUE;
    }
    @Override
    public SchemaVisitorAction visitTerminal(Schema schema) {
      return visitSchema(schema);
    }
    @Override
    public SchemaVisitorAction visitNonTerminal(Schema schema) {
      return visitSchema(schema);
    }
    @Override
    public SchemaVisitorAction afterVisitNonTerminal(Schema schema) {
      return SchemaVisitorAction.CONTINUE;
    }
    @Override
    public Void get() {
      return null;
    }
  };
  examinedSchemata.addAll(schemas);
  toCheck.addAll(schemas);
  while (!toCheck.isEmpty()) {
    Schemas.visit(toCheck.element(), findUsersOfSchemata);
    toCheck.pop();
  }
  return refereesPerSchema;
}
public List<Schema> findMainSchemata(Map<Schema, LinkedHashSet<Schema>> refereesPerSchema) {
  List<Schema> mainSchemata = new ArrayList<>();
  Set<Schema> schemasInCycles = new HashSet<>();
  for (Map.Entry<Schema, LinkedHashSet<Schema>> entry : refereesPerSchema.entrySet()) {
    Schema schema = entry.getKey();
    LinkedHashSet<Schema> referees = entry.getValue();
    if (referees.isEmpty()) {
      // The schema is never referenced
      mainSchemata.add(schema);
    } else if (referees.contains(schema) && !schemasInCycles.contains(schema)) {
      // The schema (also) references itself: it is part of a cycle
      // But it is not part of a known cycle, so we accept it as a main schema
      mainSchemata.add(schema);
      schemasInCycles.addAll(referees);
    }
    // If no branch matched, the schema is referenced somewhere and not the
    // main schema in a loop.
  }
  return mainSchemata;
}
{code}
Note that I haven't tested this code yet.
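To try the selection rule without Avro on the classpath, here is a minimal sketch that reduces each schema to its full name. It is a simplification and not the Avro API: the class {{MainSchemaSketch}}, the method {{findMain}} and the name-to-references map are hypothetical names chosen for illustration. A name counts as "main" when no other entry refers to it; for a cycle that nothing outside refers to, the sketch assumes the first member in iteration order is the entry point.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration: schemas are represented by their full names only.
public class MainSchemaSketch {

    /** Maps each schema name to the names it references; returns the "main" names. */
    public static List<String> findMain(Map<String, Set<String>> references) {
        // Collect every name that some *other* schema refers to (self-references are ignored).
        Set<String> referenced = new HashSet<>();
        for (Map.Entry<String, Set<String>> entry : references.entrySet()) {
            for (String target : entry.getValue()) {
                if (!target.equals(entry.getKey())) {
                    referenced.add(target);
                }
            }
        }

        List<String> main = new ArrayList<>();
        Set<String> reachable = new HashSet<>();
        // Pass 1: names that no other schema refers to.
        for (String name : references.keySet()) {
            if (!referenced.contains(name)) {
                main.add(name);
                markReachable(name, references, reachable);
            }
        }
        // Pass 2: anything still unreachable is part of a cycle without an
        // outside entry point; accept the first member encountered.
        for (String name : references.keySet()) {
            if (!reachable.contains(name)) {
                main.add(name);
                markReachable(name, references, reachable);
            }
        }
        return main;
    }

    // Breadth-first walk that records everything reachable from 'start'.
    private static void markReachable(String start, Map<String, Set<String>> references,
                                      Set<String> reachable) {
        Deque<String> toCheck = new ArrayDeque<>();
        toCheck.add(start);
        while (!toCheck.isEmpty()) {
            String current = toCheck.remove();
            if (reachable.add(current)) {
                // Names referenced but absent from the map have no outgoing references.
                toCheck.addAll(references.getOrDefault(current, Collections.<String>emptySet()));
            }
        }
    }

    public static void main(String[] args) {
        Map<String, Set<String>> references = new LinkedHashMap<>();
        references.put("customer.Customer", new LinkedHashSet<>());
        references.put("product.Item", new LinkedHashSet<>());
        references.put("purchase.Invoice",
                new LinkedHashSet<>(Arrays.asList("customer.Customer", "product.Item")));
        System.out.println(findMain(references)); // prints [purchase.Invoice]
    }
}
```

With the schemata from the issue, only {{purchase.Invoice}} is reported, regardless of which file each schema was declared in.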
> Add feature to know if schema is imported in avdl
> -------------------------------------------------
>
> Key: AVRO-3890
> URL: https://issues.apache.org/jira/browse/AVRO-3890
> Project: Apache Avro
> Issue Type: New Feature
> Components: java
> Affects Versions: 1.11.2, 1.11.3
> Reporter: Thun Hak
> Priority: Minor
>
> Consider this invoice.avdl:
> {code:java}
> @namespace("purchase")
> protocol InvoiceProtocol {
>   import idl "../customer/data.avdl";
>   import idl "../product/data.avdl";
>   record Invoice {
>     timestamp_ms purchaseTime;
>     customer.Customer customer;
>     array<product.Item> items = [];
>   }
> }{code}
> Now, if I do this:
> {code:java}
> for (Schema s : protocol.getTypes()) {
>   System.out.println(s.getName() + " " + s.isImported());
> }{code}
> This should produce:
> {code:java}
> Customer true
> Item true
> Invoice false{code}
> I have a use case where we want to perform operations only on the "main"
> schema and leave all imported schemas alone.
> This feature (s.isImported()) would be really nice to have (if it's not
> available already).
> Thanks
--
This message was sent by Atlassian Jira
(v8.20.10#820010)