[julia-users] Re: Getting sequence of time and subset it

2016-06-01 Thread Evan Fields
Of course there's a way :)

You can use the isna function to check if a value is NA or not. There's 
also the dropna function which takes a DataArray as input and returns a 
regular Vector with the NA elements removed.

You could try something like the following:

firstbreakcol = 4
lastbreakcol = 5
for i in 1:nrow(sdt2), t in 
sdt2[:StartTime][i]:Dates.Minute(30):sdt2[:EndTime][i]
shouldprint = true
for j in firstbreakcol:lastbreakcol
if !isna(sdt2[i,j]) && sdt2[i,j] == t
# t is in a break column
shouldprint = false
break
end
end
if shouldprint
println(t)
end
end

It's a little verbose, true. I'm sure there's something cleaner with 
dropna, filter, etc.


[julia-users] Re: Getting sequence of time and subset it

2016-05-31 Thread akrun
Many thanks for the solution and it works well with the example posted.   
But, if I have missing values in one of the columns, it throws an error.

   sdt2 = DataFrame(ID = 1:2, StartTime = DateTime(["4/13/2016 07:00", 
"4/13/2016 07:15"], "m/d/y H:M"),
   EndTime = DateTime(["4/13/2016 12:00", 
"4/13/2016 12:15"], "m/d/y H:M"),
   BreakTime1 = DateTime([NA, "4/13/2016 10:15"], 
"m/d/y H:M"),
   BreakTime2 = DateTime(["4/13/2016 12:00", 
"4/13/2016 12:15"], "m/d/y H:M")


  d1= DataFrame(id=Int64[], Time = AbstractString[])
for i in 1:nrow(sdt2), t in 
[sdt2[:StartTime][i]:Dates.Minute(30):sdt2[:EndTime][i]]
  if !(t in [sdt2[:BreakTime1][i], sdt2[:BreakTime2][i]])
  push!(d1, [i, Dates.format(t, "HH:MM")])
  end
   end

   MethodError: `convert` has no method matching convert(::Type{Int64}, 
::Array{Any,1})
This may have arisen from a call to the constructor Int64(...),
since type constructors fall back to convert methods.
Closest candidates are:
  call{T}(::Type{T}, ::Any)
  convert(::Type{Int64}, !Matched::Int8)
  convert(::Type{Int64}, !Matched::UInt8)

Is there any way to correct this?

Thanks.


 On Monday, May 30, 2016 at 9:22:24 PM UTC+5:30, Evan Fields wrote:
>
> I'm not 100% sure I understand your question, but let me give it a shot. 
>
> First thing is to note why you're getting that MethodError. It's from the 
> line
>
> println(si in sdt1[i, [:BreakTime1, :BreakTime2]])
>
> Define sdt1 as you do (I just copied into a Julia REPL) and set i = 1 to 
> for the first iteration of the for loop.
>
> Then:
>
> julia> sdt1[i, [:BreakTime1, :BreakTime2]]
> 1×2 DataFrames.DataFrame
> │ Row │ BreakTime1  │ BreakTime2  │
> ├─┼─┼─┤
> │ 1   │ 2016-04-13T10:00:00 │ 2016-04-13T12:00:00 │
>
> julia> si = 
> Dates.format([sdt1[i,:StartTime]:Dates.Minute(30):sdt1[i,:EndTime]], 
> "HH:MM")
> WARNING: [a] concatenation is deprecated; use collect(a) instead
>  in depwarn at deprecated.jl:73
>  in oldstyle_vcat_warning at abstractarray.jl:29
>  in vect at abstractarray.jl:32
> while loading no file, in expression starting on line 0
> 11-element Array{Any,1}:
>  "07:15"
>  "07:45"
>  "08:15"
>  "08:45"
>  "09:15"
>  "09:45"
>  "10:15"
>  "10:45"
>  "11:15"
>  "11:45"
>  "12:15"
>
> julia> si in sdt1[i, [:BreakTime1, :BreakTime2]]
> ERROR: MethodError: `start` has no method matching 
> start(::DataFrames.DataFrame)
>  in mapreduce_sc_impl at reduce.jl:194
>  in in at reduce.jl:377
>
> So si is of type Vector{Any} but happens to hold strings. sdt1[i, 
> [:BreakTime1, :BreakTime2]] is a DataFrame. What does it mean to ask if a 
> vector is in a DataFrame? That's what's happening in your println; the `in` 
> is just an infix operator of function in.
>
> 3-element Array{Int64,1}:
>  1
>  2
>  3
>
> julia> 1 in x
> true
>
> julia> in(1,x)
> true
>
> julia> in(5,x)
> false
>
> I assume you get the MethodError of no method start because in(a,b) 
> iterates over b, checking each element of b for equality to a. But 
> DataFrames are not iterable in this way. (This part just a guess; look it 
> up in the manual before telling your friends.)
>
> Anyway, to get output as from your R snippet, you could use a loop like 
> the following:
>
> julia> for i in 1:nrow(sdt1), t in 
> sdt1[:StartTime][i]:Dates.Minute(30):sdt1[:EndTime][i]
>if !(t in [sdt1[:BreakTime1][i], sdt1[:BreakTime2][i]])
>println("$i $(Dates.format(t, "HH:MM"))")
>end
>end
> 1 07:00
> 1 07:30
> 1 08:00
> 1 08:30
> 1 09:00
> 1 09:30
> 1 10:30
> 1 11:00
> 1 11:30
> 2 07:15
> 2 07:45
> 2 08:15
> 2 08:45
> 2 09:15
> 2 09:45
> 2 10:45
> 2 11:15
> 2 11:45
>
> I hope this helps!
>


[julia-users] Re: Getting sequence of time and subset it

2016-05-30 Thread Evan Fields
I'm not 100% sure I understand your question, but let me give it a shot. 

First thing is to note why you're getting that MethodError. It's from the 
line

println(si in sdt1[i, [:BreakTime1, :BreakTime2]])

Define sdt1 as you do (I just copied into a Julia REPL) and set i = 1 to 
for the first iteration of the for loop.

Then:

julia> sdt1[i, [:BreakTime1, :BreakTime2]]
1×2 DataFrames.DataFrame
│ Row │ BreakTime1  │ BreakTime2  │
├─┼─┼─┤
│ 1   │ 2016-04-13T10:00:00 │ 2016-04-13T12:00:00 │

julia> si = 
Dates.format([sdt1[i,:StartTime]:Dates.Minute(30):sdt1[i,:EndTime]], 
"HH:MM")
WARNING: [a] concatenation is deprecated; use collect(a) instead
 in depwarn at deprecated.jl:73
 in oldstyle_vcat_warning at abstractarray.jl:29
 in vect at abstractarray.jl:32
while loading no file, in expression starting on line 0
11-element Array{Any,1}:
 "07:15"
 "07:45"
 "08:15"
 "08:45"
 "09:15"
 "09:45"
 "10:15"
 "10:45"
 "11:15"
 "11:45"
 "12:15"

julia> si in sdt1[i, [:BreakTime1, :BreakTime2]]
ERROR: MethodError: `start` has no method matching 
start(::DataFrames.DataFrame)
 in mapreduce_sc_impl at reduce.jl:194
 in in at reduce.jl:377

So si is of type Vector{Any} but happens to hold strings. sdt1[i, 
[:BreakTime1, :BreakTime2]] is a DataFrame. What does it mean to ask if a 
vector is in a DataFrame? That's what's happening in your println; the `in` 
is just an infix operator of function in.

3-element Array{Int64,1}:
 1
 2
 3

julia> 1 in x
true

julia> in(1,x)
true

julia> in(5,x)
false

I assume you get the MethodError of no method start because in(a,b) 
iterates over b, checking each element of b for equality to a. But 
DataFrames are not iterable in this way. (This part just a guess; look it 
up in the manual before telling your friends.)

Anyway, to get output as from your R snippet, you could use a loop like the 
following:

julia> for i in 1:nrow(sdt1), t in 
sdt1[:StartTime][i]:Dates.Minute(30):sdt1[:EndTime][i]
   if !(t in [sdt1[:BreakTime1][i], sdt1[:BreakTime2][i]])
   println("$i $(Dates.format(t, "HH:MM"))")
   end
   end
1 07:00
1 07:30
1 08:00
1 08:30
1 09:00
1 09:30
1 10:30
1 11:00
1 11:30
2 07:15
2 07:45
2 08:15
2 08:45
2 09:15
2 09:45
2 10:45
2 11:15
2 11:45

I hope this helps!