Re: Column width limits?

2014-08-06 Thread Mayur Rustagi
Spark breaks data across machines at the partition level, so the realistic limit is on the partition size.

Mayur Rustagi
Ph: +1 (760) 203 3257
http://www.sigmoidanalytics.com
@mayur_rustagi

On Thu, Aug 7, 2014 at 8:41 AM, Daniel, Ronald (ELS-SDG) < r.dan...@elsevier

Column width limits?

2014-08-06 Thread Daniel, Ronald (ELS-SDG)
Assume I want to make a PairRDD whose keys are S3 URLs and whose values are Strings holding the contents of those (UTF-8) files, but NOT split into lines. Are there length limits on those files/Strings? 1 MB? 16 MB? 4 GB? 1 TB? Similarly, can such a thing be registered as a table so that I can us
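A minimal sketch of the kind of PairRDD described in the question, assuming PySpark and an already-created SparkContext passed in as `sc`. `SparkContext.wholeTextFiles` returns one (path, contents) record per file, without splitting the contents into lines. The helper name `value_sizes` and the path argument are hypothetical, chosen only for illustration. Measuring each value's length is one rough way to see how large a single record is, and a single record has to fit inside one partition.

```python
def value_sizes(sc, path):
    """Return (file_path, character_count) pairs for every file under `path`.

    `sc` is an existing SparkContext (assumed to be set up by the caller);
    `path` is a hypothetical directory or glob, for example an S3 prefix.
    """
    # wholeTextFiles yields one (path, contents) record per file, so each
    # file's full text is a single value rather than one record per line.
    pairs = sc.wholeTextFiles(path)
    # Replace each file's contents with its length in characters and
    # bring the (small) result back to the driver.
    return pairs.mapValues(len).collect()
```

With a live context this might be called as `value_sizes(sc, "s3n://some-bucket/some-prefix")`, where the bucket and prefix are placeholders. Only the lengths are collected to the driver, not the file contents themselves.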