Sounds like a good normalization of the cross-product of parsers and data sources. One nit: Why "generic" as the name? (Silly detail, I know; just seems a little too, um, generic.) Could you list out what the supported set of feeds will be - the full set - including the semi-supported ones (Condor, Twitter push, Twitter pull) - and show what the CREATE FEED statements will be for all of this? I'm not sure the community has a clear picture of what we currently have. I vote to take this opportunity to really clean this up and then advertise improved feed support on the bill of materials for the next possible release!

On 4/28/17 10:17 AM, abdullah alamoudi wrote:
Hi Devs,
Here is a bit of history. When external data access was introduced to 
AsterixDB, we had many adapters. Each adapter was a self-contained piece in 
charge of fetching and parsing data, and each had an alias (hdfs, localfs, 
twitter, socket, etc.).
This led to a lot of duplicate code. To remove the duplication, we created a 
generic adapter that consists of a pluggable data source and a pluggable data 
parser. We replaced each of the old adapters with a data source that can be 
plugged into the generic adapter.

We lost the adapters and their aliases, so a statement like using hdfs(....) 
would fail because the hdfs adapter is not there anymore. We didn't want to 
change the syntax and wanted existing statements to keep working. So in such 
a case, if the adapter is not found, we use the generic adapter and assume 
that hdfs is the data source parameter. In that sense, the adapter name 
became a parameter that sits outside the list of key-value parameters.
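
As a rough illustration of the fallback described above, here is a minimal 
sketch in Python. It is not the actual AsterixDB code: the function name 
resolve_adapter, the registered_adapters set, and the use of "datasource" as 
the internal key are all assumptions made for the example.

```python
# Hypothetical sketch of the compatibility fallback; none of these names
# come from the AsterixDB codebase.
GENERIC_ADAPTER = "generic"


def resolve_adapter(name, params, registered_adapters):
    """Return (adapter_name, params) for a `using <name>(...)` clause."""
    if name in registered_adapters:
        # The adapter exists, so use it with the parameters as given.
        return name, dict(params)
    # The adapter was not found: fall back to the generic adapter and treat
    # the requested name as the data source.
    resolved = dict(params)
    resolved["datasource"] = name  # assumed key name
    return GENERIC_ADAPTER, resolved


print(resolve_adapter("hdfs", {"path": "/data"}, {"generic"}))
```

The point of the sketch is that the name in the using clause does double duty: 
it is an adapter name when one is registered and a data source name otherwise.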

This was fine for a while, but as external data evolves and as we attempt to 
make the codebase cleaner and more maintainable, we are having to deal with 
more nuances in working around this compatibility issue.
We would like to propose a change that moves the datasource parameter into 
the list of key-value pairs. For example:

using hdfs(...) would become using generic("datasource"="hdfs")
using localfs(...) would become using generic("datasource"="localfs")
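
To make the mapping concrete, the following Python sketch renders the 
proposed clause from an old adapter alias. The helper to_generic_clause and 
the "format" parameter in the usage line are hypothetical, and the output 
follows the shorthand used in this email rather than the exact statement 
grammar.

```python
# Hypothetical helper; it mirrors the shorthand above, not the real grammar.
def to_generic_clause(adapter, params=None):
    """Render `using generic(...)` with the old alias as the datasource."""
    pairs = [("datasource", adapter)]
    pairs.extend((params or {}).items())
    body = ", ".join(f'"{key}"="{value}"' for key, value in pairs)
    return f"using generic({body})"


print(to_generic_clause("hdfs"))
print(to_generic_clause("localfs", {"format": "adm"}))
```

With the datasource carried as an ordinary pair, the adapter name no longer 
needs special handling when it is looked up.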

This would allow us to have cleaner code under the hood. We would update the 
test cases and the documentation. If anybody has an objection or a thought, 
please let us know.

Cheers,
Abdullah.
