Andrew Davidson created SPARK-12483:
---------------------------------------
             Summary: Data Frame as() does not work in Java
                 Key: SPARK-12483
                 URL: https://issues.apache.org/jira/browse/SPARK-12483
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 1.5.2
         Environment: Mac El Cap 10.11.2
                      Java 8
            Reporter: Andrew Davidson


The following unit test demonstrates a bug in as(). The column name for aliasDF was not changed.

    @Test
    public void bugDataFrameAsTest() {
        DataFrame df = createData();
        df.printSchema();
        df.show();

        DataFrame aliasDF = df.select("id").as("UUID");
        aliasDF.printSchema();
        aliasDF.show();
    }

    DataFrame createData() {
        Features f1 = new Features(1, category1);
        Features f2 = new Features(2, category2);
        ArrayList<Features> data = new ArrayList<Features>(2);
        data.add(f1);
        data.add(f2);
        //JavaRDD<Features> rdd = javaSparkContext.parallelize(Arrays.asList(f1, f2));
        JavaRDD<Features> rdd = javaSparkContext.parallelize(data);
        DataFrame df = sqlContext.createDataFrame(rdd, Features.class);
        return df;
    }

This is the output I got (without the Spark log messages):

root
 |-- id: integer (nullable = false)
 |-- labelStr: string (nullable = true)

+---+------------+
| id|    labelStr|
+---+------------+
|  1|       noise|
|  2|questionable|
+---+------------+

root
 |-- id: integer (nullable = false)

+---+
| id|
+---+
|  1|
|  2|
+---+
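
For comparison, here is a minimal sketch of two column-level renaming calls in the 1.5 Java API: Column.as(String) and DataFrame.withColumnRenamed(String, String). As far as I can tell, DataFrame.as(String) is documented as setting an alias on the DataFrame as a whole rather than on a column, which may be why the schema above is unchanged. The Features bean, the class name ColumnRenameSketch, and the local[1] master below are assumptions made for illustration; they are not taken from the report.

```java
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.sql.DataFrame;
import org.apache.spark.sql.SQLContext;

public class ColumnRenameSketch {

    // Hypothetical stand-in for the reporter's Features bean (not shown in the report).
    public static class Features implements Serializable {
        private int id;
        private String labelStr;

        public Features() { }

        public Features(int id, String labelStr) {
            this.id = id;
            this.labelStr = labelStr;
        }

        public int getId() { return id; }
        public void setId(int id) { this.id = id; }
        public String getLabelStr() { return labelStr; }
        public void setLabelStr(String labelStr) { this.labelStr = labelStr; }
    }

    // Builds the two-row DataFrame, renames "id" in two ways, and returns
    // the resulting column names of both results in order.
    static List<String> renamedColumnNames() {
        SparkConf conf = new SparkConf()
                .setMaster("local[1]")            // assumed local master for the sketch
                .setAppName("ColumnRenameSketch");
        JavaSparkContext jsc = new JavaSparkContext(conf);
        try {
            SQLContext sqlContext = new SQLContext(jsc);
            JavaRDD<Features> rdd = jsc.parallelize(Arrays.asList(
                    new Features(1, "noise"),
                    new Features(2, "questionable")));
            DataFrame df = sqlContext.createDataFrame(rdd, Features.class);

            // Option 1: alias the Column itself, then select it.
            DataFrame viaColumnAlias = df.select(df.col("id").as("UUID"));
            viaColumnAlias.printSchema();

            // Option 2: select by name, then rename the resulting column.
            DataFrame viaRename = df.select("id").withColumnRenamed("id", "UUID");
            viaRename.printSchema();

            List<String> names = new ArrayList<String>();
            names.addAll(Arrays.asList(viaColumnAlias.columns()));
            names.addAll(Arrays.asList(viaRename.columns()));
            return names;
        } finally {
            jsc.stop();
        }
    }

    public static void main(String[] args) {
        System.out.println(renamedColumnNames());
    }
}
```

Run against Spark 1.5.x, both schemas should list a single column named UUID. If they do, the difference from the test above is where the alias is applied (on the column versus on the DataFrame), not anything specific to Java.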