Hello all, I'm looking for a reference architecture for Hadoop. The only one I have found so far is the Lambda Architecture by Nathan Marz [0].
By architecture I mean answers to questions like:

- How should I store the data? CSV, Thrift, ProtoBuf?
- How should I model the data? ER model, star schema, something new?
- Normalized, denormalized, or both (master data normalized, then transformed to a denormalized form, as in ETL)?
- How should I combine a database and HDFS files?

Are there any other documented architectures for Hadoop?

Regards,
Daniel Käfer

[0] http://www.manning.com/marz/ (currently only a preprint, not yet complete)
