On Fri, Aug 19, 2016 at 11:37 AM, Stack <st...@duboce.net> wrote:

> On Thu, Aug 18, 2016 at 5:54 PM, James Taylor <jamestay...@apache.org>
> wrote:
>
> > The data loaded fine for us.
> >
>
> Mind describing what you did to get it to work and with what versions and
> configurations and with what TPC loading and how much of the workload was
> supported? Was it a one-off project?
>
Mujtaba already kindly responded to this (about a week back on this thread).
He was able to load the data for the benchmark onto one of our internal
clusters. He didn't run the benchmarks. Sorry, I don't have any more
specific knowledge, but generally I think:
- it's difficult for an open source project to troubleshoot environmental
  issues, and it's even more difficult if a user is using a vendor-specific
  distro. IMHO, if you ask an open source project for help, you should be
  using the artifacts that they produce (preferably the latest release).
- using a three-node cluster for HBase is not ideal for benchmarking.
- doing full table scans over large HBase tables will be slow.

>
> > If TPC is not representative of real
> > workloads, I'm not sure there's value in spending a lot of time running
> > them.
> >
>
> I suppose the project could just ignore TPC but I'd suggest that Phoenix
> put up a page explaining why TPC does not apply if this is the case; i.e. it
> is not representative of Phoenix work loads. When people see that Phoenix
> is for "OLTP and analytical queries", they probably think the TPC loadings
> will just work given their standing in the industry. Putting up a disavowal
> with explanation will save folks time trying to make it work and it can
> also be cited when folks try to run TPC against Phoenix and they have a bad
> experience, say bad performance.
>

I haven't run the TPC benchmarks, so I have no idea how they perform. I work
at Salesforce, where we use Phoenix (among many other technologies) to
support various big data use cases. The workloads I'm familiar with aren't
similar to the TPC benchmarks, so they're not relevant for my work. But if
TPC benchmarks are relevant for your work, then it'd be great if you pursued
this. Or maybe we can get this "Phoenix" person you mentioned to do it
(smile).
>
> On the other hand, even if an artificial loading, unless Phoenix has a
> better means of verifying all works, I'd think it would be a useful test to
> run before release or on a nightly basis verifying no regression in
> performance or in utility.
>

I think the community would welcome enhancing our existing regression test
suite. If you're up for leading that effort, that'd be great.

Thanks,
James