Question: I need the outer join of "crawl_fetch" and "content" as input to a map-reduce job I'm writing, so that I can access the *fetch time* and *fetch status* alongside the fetched content. I'd like to use `org.apache.hadoop.mapreduce.lib.join.CompositeInputFormat` for this, but its documentation states that it is "capable of performing joins over a set of data sources sorted and partitioned the same way". I know that "crawl_fetch" and "content" use the same key (the URL), but is there any guarantee that they are "sorted and partitioned the same way"? I'm using the latest Nutch version (1.15) from GitHub.

