Spoke too soon. Again, I simply want the CSV column that is read in to be a String, not an Int32.
Still having issues getting the CSV file back into a DataFrame. It's hard to understand why Julia tries to determine the column types itself when I use readtable, and why I seem to have no control over this. Why can I not say:

df1 = readtable(file1; types=Dict(1=>String)) # assuming your account number is column # 1

Reading the Julia docs, under "Advanced Options for Reading CSV Files":

*readtable accepts the following optional keyword arguments:*
*eltypes::Vector{DataType} – Specify the types of all columns. Defaults to [].*

So I tried:

*df1 = readtable(file1, Int32::Vector(String))*

and I get:

*ERROR: TypeError: typeassert: expected Array{String,1}, got Type{Int32}*

Is this even an option? Or how about converting df1_CSV to df1_dataframe?

*df1_dataframe = convert(dataframe, df1_CSV)*

I ask because CSV.read seems to give more granular control.

On Tuesday, November 1, 2016 at 7:28:36 PM UTC-4, LeAnthony Mathews wrote:
>
> Great, that worked for forcing the column into a string type.
> Thanks
>
> On Monday, October 31, 2016 at 3:26:14 PM UTC-4, Jacob Quinn wrote:
>>
>> You could use CSV.jl: http://juliadata.github.io/CSV.jl/stable/
>>
>> In this case, you'd do:
>>
>> df1 = CSV.read(file1; types=Dict(1=>String)) # assuming your account number is column # 1
>> df2 = CSV.read(file2; types=Dict(1=>String))
>>
>> -Jacob
>>
>>
>> On Mon, Oct 31, 2016 at 12:50 PM, LeAnthony Mathews <leant...@gmail.com>
>> wrote:
>>
>>> Using v0.5.0
>>> I have two different 10,000 line CSV files that I am reading into two
>>> different dataframe variables using the readtable function.
>>> Each table has in common a ten digit account_number that I would like to
>>> use as an index and join into one master file.
>>>
>>> Here is the account number example in the original CSV from file1:
>>> 8018884596
>>> 8018893530
>>> 8018909633
>>>
>>> When I do a readtable of this CSV into file1 then do a
>>> *typeof(file1[:account_number])* I get:
>>> *DataArrays.DataArray(Int32,1)*
>>> -571049996
>>> -571041062
>>> -571024959
>>>
>>> when I do a
>>> *typeof(file2[:account_number])*
>>> *DataArrays.DataArray(String,1)*
>>>
>>>
>>> *Question:*
>>> My CSV files give no guidance that account_number should be Int32 or
>>> string type. How do I force it to make both account_number elements type
>>> String?
>>>
>>> I would like this join command to work:
>>> *new_account_join = join(file1, file2, on =:account_number,kind = :left)*
>>>
>>> But I am getting this error:
>>> *ERROR: TypeError: typeassert: expected Union{Array{Symbol,1},Symbol}, got Array{Array{Symbol,1},1}*
>>> *in (::Base.#kw##join)(::Array{Any,1}, ::Base.#join, ::DataFrames.DataFrame, ::DataFrames.DataFrame) at .\<missing>:0*
>>>
>>>
>>> Any help would be appreciated.
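
A side note on the negative numbers in the quoted question: they are consistent with the ten-digit account numbers being truncated to 32 bits, since each value is larger than typemax(Int32). The sketch below uses only Base Julia and the three sample values from the thread; the variable names (accounts, wrapped, as_strings) are made up for illustration, and it assumes the truncation is what happened inside readtable rather than showing readtable itself.

```julia
# Sample account numbers copied from the quoted question.
accounts = [8018884596, 8018893530, 8018909633]

# Largest value a signed 32-bit integer can hold.
println(typemax(Int32))

# `x % Int32` keeps only the low 32 bits of x, i.e. it wraps around
# instead of throwing an error.
wrapped = [a % Int32 for a in accounts]
println(wrapped)

# Kept as strings, the identifiers survive unchanged and can be
# compared or used as join keys without any overflow.
as_strings = string.(accounts)
println(as_strings)
```

The wrapped values come out as the same three negative numbers shown in the question, which is one more reason to read an identifier column as String (or at least Int64) rather than letting it be inferred as Int32.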