Re: Change column values using several when conditions
You can check whether the value exists by using distinct before you loop over the dataset.

On Mon, 1 May 2023 at 10:38, marc nicole wrote:

> Hello
>
> I want to change values of a column in a dataset according to a mapping
> list that maps original values of that column to other new values. Each
> element of the list (colMappingValues) is a string that separates the
> original value from the new value using a ";".
>
> So for a given column (in the following example colName), I do the
> following processing to alter the column values as described:
>
>> for (i=0; i<...) {
>>     // the list below contains the distinct values of a column
>>     // (colMappingValues[i]) and their target values
>>     allValuesChanges = colMappingValues[i].toString().split(";", 2);
>>
>>     dataset = dataset.withColumn(colName,
>>         when(dataset.col(colName).equalTo(allValuesChanges[0]), allValuesChanges[1])
>>             .otherwise(dataset.col(colName)));
>> }
>
> which is working, but I want it to be efficient and avoid unnecessary
> iterations. That is, when the column doesn't contain a value from the
> list, I want the call to withColumn() to be skipped.
> How can I do that more efficiently using Spark in Java?

--
Bjørn Jørgensen
Vestre Aspehaug 4, 6010 Ålesund
Norge

+47 480 94 297
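
The distinct check suggested in this reply could look roughly like the sketch below. It is plain Java and only covers the filtering step: it assumes the distinct values of the column have already been collected into a Set<String> (for instance from dataset.select(colName).distinct().collectAsList(), which only makes sense when the number of distinct values is small enough to bring to the driver). The class name MappingFilter and the method applicableMappings are hypothetical, introduced here for illustration only.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of Spark: decides which mapping entries are worth applying.
class MappingFilter {

    // Keeps only the "original;new" entries whose original value is actually
    // present in the column, so that withColumn() is called only for those.
    static Map<String, String> applicableMappings(List<String> colMappingValues,
                                                  Set<String> presentValues) {
        Map<String, String> result = new LinkedHashMap<>();
        for (String entry : colMappingValues) {
            // Same split as in the original loop: at most two parts.
            String[] parts = entry.split(";", 2);
            if (parts.length == 2 && presentValues.contains(parts[0])) {
                result.put(parts[0], parts[1]);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Assumed to come from a distinct() on the column; hard-coded here.
        Set<String> present = new HashSet<>(Arrays.asList("a", "b"));
        List<String> mappings = Arrays.asList("a;x", "c;y", "b;z;w");
        // "c;y" is dropped because "c" does not occur in the column.
        System.out.println(applicableMappings(mappings, present)); // prints {a=x, b=z;w}
    }
}
```

The loop over the dataset would then iterate over the returned map instead of over every element of colMappingValues. Whether this pays off depends on the data: the distinct() itself costs a Spark job, so it is mainly useful when most mapping entries do not occur in the column.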
Change column values using several when conditions
Hello

I want to change values of a column in a dataset according to a mapping list that maps original values of that column to other new values. Each element of the list (colMappingValues) is a string that separates the original value from the new value using a ";".

So for a given column (in the following example colName), I do the following processing to alter the column values as described:

> for (i=0; i<...) {
>     // the list below contains the distinct values of a column
>     // (colMappingValues[i]) and their target values
>     allValuesChanges = colMappingValues[i].toString().split(";", 2);
>
>     dataset = dataset.withColumn(colName,
>         when(dataset.col(colName).equalTo(allValuesChanges[0]), allValuesChanges[1])
>             .otherwise(dataset.col(colName)));
> }

which is working, but I want it to be efficient and avoid unnecessary iterations. That is, when the column doesn't contain a value from the list, I want the call to withColumn() to be skipped.

How can I do that more efficiently using Spark in Java?
How to change column values using several when conditions?
Hello to you Sparkling community :)

I want to change values of a column in a dataset according to a mapping list that maps original values of that column to other new values. Each element of the list (colMappingValues) is a string that separates the original value from the new value using a ";".

So for a given column (in the following example colName), I do the following processing to alter the column values as described:

> for (i=0; i<...) {
>     // the list below contains the distinct values of a column
>     // (colMappingValues[i]) and their target values
>     allValuesChanges = colMappingValues[i].toString().split(";", 2);
>
>     dataset = dataset.withColumn(colName,
>         when(dataset.col(colName).equalTo(allValuesChanges[0]), allValuesChanges[1])
>             .otherwise(dataset.col(colName)));
> }

which is working, but I want it to be efficient and avoid unnecessary iterations. That is, when the column doesn't contain a value from the list, I want the call to withColumn() to be skipped.

How can I do that more efficiently using Spark in Java?

Thanks.