Sounds like a good normalization of the cross-product of parsers and
data sources. One nit: Why "generic" as the name? (Silly detail, I
know; just seems a little too, um, generic.) Could you list out what
the supported set of feeds will be - the full set - including the
semi-supported ones (Condor, Twitter push, Twitter pull) - and show what
the CREATE FEED statements will be for all of this? I'm not sure the
community has a clear picture of what we currently have. I vote to take
this opportunity to really clean this up and then advertise improved
feed support on the bill of materials for the next possible release!
On 4/28/17 10:17 AM, abdullah alamoudi wrote:
Hi Devs,
Here is a bit of history. When external data access was introduced to
AsterixDB, we had many adapters. Each adapter was a self-contained piece in
charge of both fetching and parsing data. Each adapter had an alias (hdfs,
localfs, twitter, socket, etc.).
This led to a lot of duplicate code. To remove the duplication, we created a
generic adapter which consists of a pluggable data source and a pluggable data
parser. We replaced each of those old adapters with a data source that can be
plugged into the generic adapter.
We lost the adapters and their aliases, so a statement like using hdfs(....)
would fail because the hdfs adapter is not there anymore. We didn't want to
change the syntax and wanted existing statements to keep working. So in such a
case, if the adapter is not found, we use the generic adapter and treat the
adapter name (hdfs here) as the data source parameter. In that sense, the
adapter name became a parameter that sits outside the list of key-value pairs.
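
The fallback described above can be sketched roughly as follows. This is only
an illustration in Python, not the actual AsterixDB code: the registry set, the
function name resolve_adapter, and the "path" key are hypothetical, and the
"datasource" key is borrowed from the proposal further down.

```python
# Hypothetical registry: after the refactoring only the generic adapter remains.
REGISTERED_ADAPTERS = {"generic"}


def resolve_adapter(adapter_name, params):
    """Return (adapter, params) using the compatibility fallback.

    If the named adapter is registered, use it as-is. Otherwise fall back to
    the generic adapter and treat the name as the data source parameter.
    """
    resolved = dict(params)  # copy so the caller's dict is not mutated
    if adapter_name in REGISTERED_ADAPTERS:
        return adapter_name, resolved
    # Adapter not found: the alias becomes an implicit data source parameter.
    resolved["datasource"] = adapter_name
    return "generic", resolved


# "path" is an illustrative parameter name, not taken from the message.
adapter, params = resolve_adapter("hdfs", {"path": "/data/in"})
```

The point of the sketch is that the alias ends up as just another parameter,
but only after a lookup miss, which is the special case the proposal removes.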
This was fine for a while but as external data evolves and as we attempt to
make the codebase cleaner and more maintainable, we are having to deal with
more nuances working around this compatibility issue.
We would like to propose a change that moves the data source parameter into
the list of key-value pairs. For example:
using hdfs(...) would become using generic("datasource"="hdfs")
using localfs(...) would become using generic("datasource"="localfs")
This would give us cleaner code under the hood. We would update the test cases
and the documentation accordingly. If anybody has an objection or a thought,
please let us know.
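
For updating existing test cases, the rewrite from the old form to the
proposed one is mechanical. Below is a rough Python sketch of such a rewrite.
It is an assumption-laden illustration rather than a real migration tool: the
function name migrate is hypothetical, it only knows the two aliases used in
the examples above, it follows the shorthand used in this message rather than
the exact statement grammar, and "path" is an illustrative parameter name.

```python
import re

# Only the two aliases used in the examples above; other aliases would need
# to be added explicitly.
LEGACY_ALIASES = ("hdfs", "localfs")

# Matches "using <alias>(" and optionally an immediately following ")".
_USING = re.compile(r'\busing\s+(' + "|".join(LEGACY_ALIASES) + r')\s*\(\s*(\)?)')


def migrate(statement):
    """Rewrite 'using <alias>(...)' into 'using generic("datasource"="<alias>", ...)'."""
    def _replace(match):
        head = 'using generic("datasource"="%s"' % match.group(1)
        if match.group(2):
            # Empty parameter list: close the parenthesis right away.
            return head + ")"
        # Existing parameters follow, so separate them with a comma.
        return head + ", "
    return _USING.sub(_replace, statement)


print(migrate('using hdfs("path"="/tmp/x")'))
```

Statements that already use the generic adapter are left unchanged, so the
rewrite can be run more than once over the same files.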
Cheers,
Abdullah.