-----Original Message-----
From: Thiago Macieira <[email protected]> 
Sent: Thursday, 14 June 2018 02:08

> This email is to start that discussion and answer these questions:
>   a) should we have this API?
>   b) if so, what would this API look like?
>   c) if not, should we unify at least JSON and CBOR?
>   c) either way, should remove QCborValue until we have it?
...
> This API would also be the replacement of the JSON and CBOR parsers, for
> which we'd have a unified API, something like:
>   QFoo json = QJson::fromJson(jcontent);
>   QFoo cbor = QCbor::fromCbor(ccontent);
>   qDebug() << QCbor::toDiagnosticFormat(json);        // yes, json

Hi all,

As I was saying during the QtCS "QDebug and others" session, structured traces
need a way to serialize in-memory data structures in a number of formats for
interoperability reasons. This requires a generic way to traverse data
structures, not necessarily a generic data structure.

A common data structure for Cbor and Json makes sense since they share so much.
But even these two formats have peculiarities that may need distinct APIs, like
Cbor tags. This is even more true for Xml. I think that Cbor found a sweet spot
between generality and efficiency, so the data structure behind QCborValue will
be good for many use cases. But like any generic data structure, it will not
suit all use cases. Look at Python: although it has general-purpose list, dict,
and so on, its Json decoder provides hooks to bind data to specialized types.

So, I think it is perfectly OK to have a common data structure for Cbor and
Json, but it does not preclude the need for specific APIs. Also, specific
documentation is usually easier to understand than generic documentation since
you can refer to the Json and Cbor standards and examples. I think it is also
OK to have QCborValue in 5.12 because we can always add a more generic API as a
thin layer on top of specialized data structures, especially in Qt6 if we take
advantage of C++17.

Since the title of the discussion is so general, please let me sketch here what
such an API could be. That may help find a definitive answer to your questions.

The basic existing API for reading/writing is streams. One problem with streams
is that the structure of the data being traversed is lost. So, a reader must
know the structure to read the data. And in some cases, ambiguities may even
prevent reading the data back:

    cout << 1.0 << 1;        // writes "11": 1.0 prints as "1", no separator
    cin >> myFloat >> myInt; // may well read myFloat==11 and fail on myInt
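
The ambiguity can be reproduced with standard string streams. The following is
only a small sketch; the names RoundTrip and roundTrip are made up for this
illustration. It writes a double and an int back to back, then tries to read
them again from the resulting text:

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Result of writing 1.0 and 1 with operator<< and reading them back.
struct RoundTrip {
    std::string text;  // what was actually written
    double f;          // value read back into the floating-point variable
    int i;             // value read back into the integer variable
    bool ok;           // true only if both reads succeeded
};

RoundTrip roundTrip() {
    std::ostringstream out;
    out << 1.0 << 1;   // default formatting, no separator
    assert(out.good());

    RoundTrip r{out.str(), 0.0, -1, false};
    std::istringstream in(r.text);
    in >> r.f >> r.i;  // the reader cannot tell where the first value ended
    r.ok = !in.fail();
    return r;
}
```

With default formatting the text is "11", so the first extraction consumes
everything and the second one fails: the structure (two values) was lost on
the way out.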

QDebug avoids most problems by inserting spaces between << by default but does
not allow reading. Also, a user-defined type T needs slightly different code
for writing to QDebug, for writing to other formats, and for reading the
resulting text...

The approach we took in the MODMED project originates in functional Zippers,
which are generalized iterators for arbitrary data structures, not just
sequences. It makes the data structure apparent in the traversal. Also, the
traversal can be adapted to the task at hand. For instance, a user-defined type
may ignore some Json data it does not understand while reading. Thus, the
approach allows binding "any data with a common structure", such as a generic
QCborValue and a user-defined type, or a QByteArray containing Cbor data or
utf8-encoded Json.

Let me dream what this approach could look like in Qt, by first using the
approach to directly write some usual data types in Cbor or Json:

    QVector<double> vector = {1.,2.};
    QByteArray buffer;
    QCborWriter cborw(&buffer);
    cborw.sequence().bind("val").bind(true).bind(vector); 
    // buffer = 0x9F6376616...

Note: A generic encoder would use Cbor indefinite-length arrays and few or no
Cbor tags, so a specialized encoder would still be needed for some use cases.

    buffer.clear();
    QJsonWriter jsonw(&buffer);
    jsonw.sequence().bind("val").bind(true).bind(vector); // same code as above
    // buffer = ["val",true,[1.0,2.0]]

In our approach, "bind" handles Read and Write the same way, so it is possible 
to do:

    QString val; bool b;
    QJsonReader jsonr(&buffer);
    jsonr.sequence().bind(val).bind(b).bind(vector); // same code with lvalues
    // val = "val", b = true, ...
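
To make the "same code for Read and Write" idea concrete without Qt, here is a
minimal standard C++ sketch. MiniWriter, MiniReader and bindPair are
hypothetical names, not the proposed Qt classes, and the format is reduced to a
flat array of unescaped strings and non-negative integers:

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical writer: appends a flat JSON-like array to a string.
class MiniWriter {
public:
    explicit MiniWriter(std::string *out) : out_(out) { assert(out); }
    MiniWriter &sequence() { *out_ += '['; return *this; }
    MiniWriter &bind(const std::string &v) { sep(); *out_ += '"' + v + '"'; return *this; }
    MiniWriter &bind(int v) { sep(); *out_ += std::to_string(v); return *this; }
    bool out() { *out_ += ']'; return true; }
private:
    void sep() { if (!first_) *out_ += ','; first_ = false; }
    std::string *out_;
    bool first_ = true;
};

// Hypothetical reader: parses what MiniWriter produced into lvalues.
class MiniReader {
public:
    explicit MiniReader(const std::string &in) : in_(in) {}
    MiniReader &sequence() { expect('['); return *this; }
    MiniReader &bind(std::string &v) {
        sep();
        expect('"');
        std::size_t end = ok_ ? in_.find('"', pos_) : std::string::npos;
        if (end == std::string::npos) { ok_ = false; return *this; }
        v = in_.substr(pos_, end - pos_);
        pos_ = end + 1;
        return *this;
    }
    MiniReader &bind(int &v) {
        sep();
        int n = 0;
        bool any = false;
        while (ok_ && pos_ < in_.size() &&
               std::isdigit(static_cast<unsigned char>(in_[pos_]))) {
            n = n * 10 + (in_[pos_++] - '0');
            any = true;
        }
        if (any) v = n; else ok_ = false;
        return *this;
    }
    bool out() { expect(']'); return ok_; }
private:
    void expect(char c) {
        if (ok_ && pos_ < in_.size() && in_[pos_] == c) ++pos_; else ok_ = false;
    }
    void sep() { if (!first_) expect(','); first_ = false; }
    std::string in_;
    std::size_t pos_ = 0;
    bool ok_ = true, first_ = true;
};

// The traversal is written once and works for both directions.
template <class Format>
bool bindPair(Format &fmt, std::string &label, int &count) {
    return fmt.sequence().bind(label).bind(count).out();
}
```

Calling bindPair with a MiniWriter serializes the two lvalues, and calling it
with a MiniReader on the resulting text fills them back in.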

This can work with any in-memory data type like QMap, QVector or QCborValue.
It just requires a bind method or a QBind<TResult,T> functor definition. Let me
show you the default templated QBind definition:

    template<class TResult, typename T>
    struct QBind {
        static TResult bind(Val<TResult> value, T t) {
            // In case of error, define a T::bind(Val<TResult>) method or an
            // external QBind<TResult,T>::bind(Val<TResult>,T) functor
            return t.bind(value);
        }
    };
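
The two customization points (a member bind, or an external specialization of
the functor) can be illustrated in plain C++. This is a simplified sketch:
TextVal, Bind, Celsius and bindValue are invented names, and bind returns bool
here instead of TResult to keep the example short:

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for Val<TResult>: it just appends text to a string.
struct TextVal {
    std::string *out;
};

// Default: forward to a member function T::bind(value).
template <class TValue, class T>
struct Bind {
    static bool bind(TValue value, const T &t) { return t.bind(value); }
};

// External specialization for a type that cannot have a bind member.
template <class TValue>
struct Bind<TValue, int> {
    static bool bind(TValue value, const int &t) {
        assert(value.out);
        *value.out += std::to_string(t);
        return true;
    }
};

// A user-defined type that opts in through a member function.
struct Celsius {
    int degrees;

    template <class TValue>
    bool bind(TValue value) const {
        bool ok = Bind<TValue, int>::bind(value, degrees);
        *value.out += "C";
        return ok;
    }
};

// Single entry point that picks the right Bind for any T.
template <class TValue, class T>
bool bindValue(TValue value, const T &t) {
    return Bind<TValue, T>::bind(value, t);
}
```

bindValue(TextVal{&s}, Celsius{21}) goes through the default template and the
member function, while bindValue(TextVal{&s}, 7) goes through the int
specialization.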

Most user-defined bind methods would be very simple and the type system would 
guarantee that data is well-formed (no sequence or record left open...):

    struct Person {
        QString m_firstName, m_lastName;
        int m_age;
    
        template<class TResult>
        TResult bind(Val<TResult> value) { return value
            .record()
                .sequence("name")
                    .bind(m_firstName)
                    .bind(m_lastName)
                .out()
                .bind("age" , m_age); // automagically closes opened record
        }
    };
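
One way the type system can enforce well-formedness is to encode the nesting
in the return types, so that out() hands back the enclosing level. The sketch
below is write-only and uses invented names (Closed, Seq, Rec, Val); unlike the
Person example above, it requires an explicit final out() rather than closing
the record automatically:

```cpp
#include <cassert>
#include <string>

// State reached only when every record and sequence has been closed.
struct Closed { std::string *text; bool first; };

inline void writeKey(std::string *text, bool first, const std::string &key) {
    if (!first) *text += ',';
    *text += '"' + key + "\":";
}

// An open sequence nested inside an Outer level.
template <class Outer>
struct Seq {
    std::string *text;
    bool first;  // true until the first item has been written
    Seq bind(const std::string &s) {
        if (!first) *text += ',';
        *text += '"' + s + '"';
        return Seq{text, false};
    }
    Outer out() { *text += ']'; return Outer{text, false}; }
};

// An open record nested inside an Outer level.
template <class Outer>
struct Rec {
    std::string *text;
    bool first;
    Seq<Rec<Outer>> sequence(const std::string &key) {
        writeKey(text, first, key);
        *text += '[';
        return Seq<Rec<Outer>>{text, true};
    }
    Rec bind(const std::string &key, int v) {
        writeKey(text, first, key);
        *text += std::to_string(v);
        return Rec{text, false};
    }
    Outer out() { *text += '}'; return Outer{text, false}; }
};

// Entry point of a traversal.
struct Val {
    std::string *text;
    Rec<Closed> record() { assert(text); *text += '{'; return Rec<Closed>{text, true}; }
};

struct Person {
    std::string firstName, lastName;
    int age;

    Closed bind(Val value) const {
        return value.record()
                   .sequence("name").bind(firstName).bind(lastName).out()
                   .bind("age", age)
                   .out();  // dropping this out() no longer compiles
    }
};
```

Because Person::bind must return Closed, forgetting either out() leaves an
expression of type Rec<Closed> or Seq<Rec<Closed>>, which does not convert to
Closed, so the mistake is caught at compile time.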

One advantage of the approach is that such boilerplate code would have to be
written once for any TResult (be it a QJsonReader, QCborWriter, etc.), so the
above code would be enough to allow:

    QByteArray json;
    QJsonWriter jsonw(&json); jsonw.bind(Person {"John","Doe",42});
    // json = {"name":["John","Doe"],"age":42}
    Person p;
    QJsonReader jsonr(&json); jsonr.bind(p); // p = Person {"John","Doe",42}
    QByteArray cbor;
    QCborWriter cborw(&cbor); cborw.bind(p); // cbor = 0xBF646E616D659F64...

Note: Dynamic data structures' bind methods need to handle Write and Read
differently, but user-defined types are rarely dynamically sized.

The approach even works with QIODevice and no intermediate in-memory data, 
so it is possible to do:

    QFile in, out; // or any other concrete QIODevice (QIODevice is abstract)
    // open appropriately
    QJsonReader jsonr(&in);
    QCborWriter cborw(&out);
    if (cborw.bind(jsonr)) cout << "Done.";
    // transforms any Json to Cbor without loading everything in memory
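
The streaming idea can be sketched with standard streams in place of
QIODevice. The functions below are hypothetical and only handle a tiny Json
subset (nested arrays of non-negative integers, with no check of comma
placement or integer overflow). Arrays are emitted as Cbor indefinite-length
arrays (0x9F ... 0xFF), so nothing has to be buffered to learn a length:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Writes an unsigned integer as Cbor major type 0.
void writeCborUInt(std::ostream &out, std::uint32_t n) {
    if (n < 24) {
        out.put(static_cast<char>(n));
    } else if (n < 0x100) {
        out.put(0x18);
        out.put(static_cast<char>(n));
    } else if (n < 0x10000) {
        out.put(0x19);
        out.put(static_cast<char>(n >> 8));
        out.put(static_cast<char>(n & 0xFF));
    } else {
        out.put(0x1A);
        for (int shift = 24; shift >= 0; shift -= 8)
            out.put(static_cast<char>((n >> shift) & 0xFF));
    }
}

// Transcodes input such as "[1,[2,300]]" one token at a time.
// Returns false on unexpected characters or unbalanced brackets.
bool jsonToCbor(std::istream &in, std::ostream &out) {
    int depth = 0;
    char c;
    while (in.get(c)) {
        if (c == '[') {
            out.put(static_cast<char>(0x9F));  // start indefinite-length array
            ++depth;
        } else if (c == ']') {
            if (depth == 0) return false;
            out.put(static_cast<char>(0xFF));  // "break" closes the array
            --depth;
        } else if (c >= '0' && c <= '9') {
            std::uint32_t n = static_cast<std::uint32_t>(c - '0');
            while (in.peek() >= '0' && in.peek() <= '9')
                n = n * 10 + static_cast<std::uint32_t>(in.get() - '0');
            writeCborUInt(out, n);
        } else if (c != ',' && c != ' ' && c != '\n') {
            return false;
        }
    }
    assert(depth >= 0);
    return depth == 0;
}
```

Only the current token and the nesting depth are kept in memory, whatever the
size of the input.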

To sum up:
* this approach can use QCborValue, which is a nice balance between generality
and efficiency and provides in-place editing
* QBind could provide generic read/write of QCborValue (or any other data type)
to a number of formats, but...
* specific writers/readers may always be necessary
* QCborValue may differ from QJsonValue at some point to handle Cbor tags and
other peculiarities

To move on, we have a working structured traces library implementing this
approach. However, writing with it is about 10 times slower than boilerplate
code using QDebug. Based on our previous work and using modern C++ compilers,
it seems possible to implement the approach with more reasonable write
performance. So, I will try to submit a proof of concept in the following days.

In the meantime, I've put a few details on our approach and links to related
discussions on the "QDebug" session wiki page:
https://wiki.qt.io/QDebug_and_other_tracing_facilities

Hope it helps,
Arnaud
_______________________________________________
Development mailing list
[email protected]
http://lists.qt-project.org/mailman/listinfo/development
