since a Dataset is a typed object, you ideally don't have to think about field names.
however, some operations on Dataset require you to provide a Column, for example joinWith (which returns a strongly typed Dataset, not a DataFrame). once you have to provide a Column you are back to thinking in field names and worrying about duplicate field names, which can easily end up in a Dataset without you realizing it.

under the hood a Dataset does have a unique identifier for every column, as in dataset.queryExecution.logical.output, but these are expressions (attributes) that i cannot turn back into columns, since the constructors for this are private in spark.

so how about adding Dataset.apply(i: Int): Column, which would let me pick columns by position without having to worry about (duplicate) field names? then i could do something like:

dataset.joinWith(otherDataset, dataset(0) === otherDataset(0), joinType)
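
to make the request concrete, here is a minimal sketch of the closest thing i can build today using only public API. everything named here is my own invention for illustration (PositionalOps, colAt, Person, Order); none of it exists in spark. note that colAt still resolves the column by name via ds.columns(i), so it is only an approximation: i would expect it to hit the same ambiguity when a Dataset contains duplicate field names, which is exactly why a built-in positional accessor backed by the internal attributes would be needed.

```scala
import org.apache.spark.sql.{Column, Dataset, SparkSession}

object PositionalColumns {

  // Hypothetical helper, NOT part of Spark: looks up the i-th field name
  // and resolves it through the public Dataset.col(name) method.
  // Assumption/limitation: resolution is still name-based, so this does
  // not solve the duplicate-field-name case described above.
  implicit class PositionalOps[T](val ds: Dataset[T]) extends AnyVal {
    def colAt(i: Int): Column = ds.col(ds.columns(i))
  }

  // Illustrative types only.
  final case class Person(id: Long, name: String)
  final case class Order(id: Long, item: String)

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("positional-columns-sketch")
      .getOrCreate()
    import spark.implicits._

    val people = Seq(Person(1L, "ann"), Person(2L, "bob")).toDS()
    val orders = Seq(Order(1L, "book"), Order(3L, "pen")).toDS()

    // Join on the first column of each side without spelling out "id".
    // joinWith keeps the result strongly typed as Dataset[(Person, Order)].
    val joined: Dataset[(Person, Order)] =
      people.joinWith(orders, people.colAt(0) === orders.colAt(0), "inner")

    joined.show()
    spark.stop()
  }
}
```

with the proposed Dataset.apply(i: Int) the call site would read dataset(0) instead of dataset.colAt(0), and the lookup could go through the unique attribute for that position rather than through its name.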