Dear Julia,
Sorry, my previous answer was rough and not very detailed. But you are correct:
TB$Date <- as.POSIXct(strptime(TB$DateTime, "%d.%m.%Y %H:%M:%S", tz =
"GMT"))
you created a new column (Date POSIXct),
Exactly. If you want, you can also overwrite '$DateTime' directly. But
'$Date' is a common name for a vector of dates (with or without
time).
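A quick way to check that the conversion worked is to count the values that failed to parse, since strptime() returns NA for any string that does not match the format. The sketch below uses made-up DateTime strings (the third one is deliberately in the wrong format); the name n_failed is only illustrative.

```r
# Toy DateTime strings; the last one does not follow the day.month.year format
DateTime <- c("11.02.2010 17:08:34", "11.02.2010 17:13:36", "2010-02-11 17:20:00")

# Same conversion as in the thread
Date <- as.POSIXct(strptime(DateTime, "%d.%m.%Y %H:%M:%S", tz = "GMT"))

# Strings that do not match the format come back as NA
n_failed <- sum(is.na(Date))

print(Date)
print(n_failed)
```

If n_failed is greater than zero on the real data, the offending rows are worth inspecting before building the trajectory.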
but what about the second step?
TB2 <- TB[!duplicated(paste(TB$Date, TB$Bird)), ]
Is this to create a new file without any duplicates?
This is indeed to create a new data frame with only the non-duplicated
dates per individual. Let me explain in detail:
paste(TB$Date, TB$Bird)
creates a temporary vector with both the date and the id pasted together
(e.g. '2010-02-11 17:08:34 B18' if 'B18' is one id). This is where you
need to look for duplicates, since you don't want to remove relocations
of different individuals at the exact same time. You check for
duplicates with:
duplicated(paste(TB$Date, TB$Bird))
It returns a logical vector (TRUE/FALSE, with TRUE when the element is
duplicated). To select only the not-duplicated elements, use the '!'
operator ("not"), and because you apply it to the rows of the data frame
TB, and you still want every column, it becomes:
TB[!duplicated(paste(TB$Date, TB$Bird)), ]
I just stored it into 'TB2' so that you can look at the differences
between TB and TB2.
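Here is a small self-contained illustration of these steps. The data frame is invented for the example (bird ids, times and coordinates are all made up); only the column names follow the thread.

```r
# Toy data: bird B18 has one repeated timestamp; B21 shares timestamps with
# B18 but has no repeats of its own
TB <- data.frame(
  Bird = c("B18", "B18", "B18", "B21", "B21"),
  Date = as.POSIXct(c("2010-02-11 17:08:34", "2010-02-11 17:08:34",
                      "2010-02-11 17:13:36", "2010-02-11 17:08:34",
                      "2010-02-11 17:13:36"), tz = "GMT"),
  LON = c(159.07, 159.07, 159.08, 159.10, 159.11),
  LAT = c(-31.53, -31.53, -31.54, -31.50, -31.51)
)

# One key per row, combining date and bird id
key <- paste(TB$Date, TB$Bird)

# TRUE marks the second and later occurrences of a key
dup <- duplicated(key)

# Keep every column, but only the rows whose key is not a repeat
TB2 <- TB[!dup, ]

print(dup)
print(TB2)
```

Only the second row is dropped: the B21 rows share their times with B18 but belong to a different bird, so their keys differ.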
I'm still not sure I understand where my duplicates come from. And if I
delete them, won't I lose any important data?
This is a difficult question... The duplicates may come from earlier
changes to the data. Did you work on the raw data directly? They may
also come from technical glitches, which sometimes happen with GPS
technology. You can check whether you have a lot of duplicates or not:
sum(duplicated(paste(TB$Date, TB$Bird)))/nrow(TB)
will give you the proportion of duplicates. If the proportion is high
(say, more than a few percent), I suggest you manually check the raw data,
and then the equipment. Your second question also depends on this: if you
have only a few duplicates (say < 1%), it won't change anything at all.
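One possible way to look a bit closer, sketched on invented data: compute the proportion overall and per bird, and separate exact repeats (same time and same position) from rows that share a timestamp but report a different position. The second kind is the one where dropping a row discards a distinct fix. All values and the names dup_time, dup_full and conflicting are assumptions made for this example.

```r
# Toy data: B18 has an exact repeat; B21 has two different positions at one time
TB <- data.frame(
  Bird = c("B18", "B18", "B18", "B21", "B21", "B21"),
  Date = as.POSIXct(c("2010-02-11 17:08:34", "2010-02-11 17:08:34",
                      "2010-02-11 17:13:36", "2010-02-11 17:08:34",
                      "2010-02-11 17:08:34", "2010-02-11 17:13:36"), tz = "GMT"),
  LON = c(159.07, 159.07, 159.08, 159.10, 159.12, 159.11),
  LAT = c(-31.53, -31.53, -31.54, -31.50, -31.52, -31.51)
)

# Duplicated timestamps within a bird
dup_time <- duplicated(paste(TB$Date, TB$Bird))

# Rows that repeat an earlier row in every field
dup_full <- duplicated(paste(TB$Date, TB$Bird, TB$LON, TB$LAT))

# Overall proportion; mean() of a logical is the same as sum(x) / length(x)
overall <- mean(dup_time)

# Proportion for each bird
per_bird <- tapply(dup_time, TB$Bird, mean)

# Same bird and time as an earlier row, but a different position
conflicting <- dup_time & !dup_full

print(overall)
print(per_bird)
print(TB[conflicting, ])
```

Rows flagged in conflicting are the ones worth checking by hand, since removing them throws away a position that differs from the one kept.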
Hope this helps,
Mathieu.
So far, thank you heaps.
Best regards,
Julia
On 14/08/2010, at 9:11 , Mathieu Basille wrote:
[I repost this message to the list since I used the wrong e-mail
address...]
Dear Julia,
You can probably simplify Tyler's approach (who is totally right about
the highlighted problems) to the following:
TB <- as.data.frame(read.table("GPS_2009_2010R.csv", header = TRUE,
sep = ','))
TB$Date <- as.POSIXct(strptime(TB$DateTime, "%d.%m.%Y %H:%M:%S", tz =
"GMT"))
TB2 <- TB[!duplicated(paste(TB$Date, TB$Bird)), ]
## You can also store it directly into TB, but that way you can still
## check the original data that remains unchanged
tr <- as.ltraj(TB2[, c("LON","LAT")], TB2$Date, TB2$Bird)
How does that work?
Sincerely,
Mathieu.
On 14/08/2010 10:31, Tyler Dean Rudolph wrote:
Hi Julia,
You have two issues here: one is why do you have so many duplicated
observations? This may be in part due to having different birds sampled
at the same moment in time....
> TB$gmt = TB$gmt[-dupz]
Error in `$<-.data.frame`(`*tmp*`, "gmt", value = c(1265908114,
1265908416, :
replacement has 22079 rows, data has 22274
The reason you are getting this error is that you cannot reduce the
length of one column alone in a data.frame and keep the others at their
original length. You can, however, remove all associated observations
period:
TB = TB[-dupz,]
This works because every column has the same number of elements, and
since you know which elements of TB$gmt are duplicated you can use that
as an index of which rows you no longer want in the data frame.
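The difference between the two calls can be reproduced on a toy data frame (values invented for the example). The sketch also shows a pitfall of negative indexing that is not mentioned above: when which() finds nothing, TB[-dupz, ] returns zero rows, so a logical index is the safer form.

```r
# Toy data with one repeated timestamp (row 2)
TB <- data.frame(
  Bird = c("B18", "B18", "B18", "B21", "B21"),
  gmt = as.POSIXct(c("2010-02-11 17:08:34", "2010-02-11 17:08:34",
                     "2010-02-11 17:13:36", "2010-02-11 17:20:00",
                     "2010-02-11 17:25:00"), tz = "GMT")
)
dupz <- which(duplicated(TB$gmt))

# Shortening a single column fails: 4 values cannot fill 5 rows
res <- try(TB$gmt <- TB$gmt[-dupz], silent = TRUE)
column_failed <- inherits(res, "try-error")

# Dropping whole rows works because all columns shrink together
TB_rows <- TB[-dupz, ]

# Pitfall: an empty negative index selects nothing at all
no_dupz <- integer(0)
n_empty <- nrow(TB[-no_dupz, ])

# A logical index behaves correctly whether or not duplicates exist
TB_safe <- TB[!duplicated(TB$gmt), ]

print(column_failed)
print(nrow(TB_rows))
print(n_empty)
print(nrow(TB_safe))
```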
But you are probably more interested in identifying duplicated date/time
values within the set of observed values for each unique bird, is that
not correct? In this case there is an extra step, albeit an important
one:
# First you split your data frame into a list by unique bird
splitdata = split(TB, TB$Bird, drop=TRUE)
Note the drop=TRUE argument is in case you have levels of a factor with
no observations - which can happen - therefore you choose to drop those
levels (although you can still keep them if you have some compelling
reason, but it may complicate the results).
# Next you identify duplicates within each unique bird data set.
splitdupz = lapply(splitdata, function(birdup)
which(duplicated(birdup$gmt)))
str(splitdupz)
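# If you've had a look and you've decided to remove them, you can do it
# this way. A logical index is used here because birdup[-which(...), ]
# returns zero rows for a bird that has no duplicates.
splitdata.nodupz = lapply(splitdata, function(birdup)
birdup[!duplicated(birdup$gmt), ])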
# Now you can recompile your data frame with the duplicates from your
# unique Bird data sets removed.
TB2 = do.call(rbind, splitdata.nodupz)
I don't have any of your sample data to work with so I can't verify this
has been debugged, but it should give you a pretty good head start.
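For reference, the split / lapply / rbind sequence can be tried on a small invented data frame. This is only a sketch under assumed data (ids and times are made up); it uses a logical index inside lapply() so that a bird without duplicates keeps all its rows, and it compares the result with the one-line paste() approach.

```r
# Toy data: B18 has one repeated timestamp, B21 has none
TB <- data.frame(
  Bird = factor(c("B18", "B18", "B18", "B21", "B21")),
  gmt = as.POSIXct(c("2010-02-11 17:08:34", "2010-02-11 17:08:34",
                     "2010-02-11 17:13:36", "2010-02-11 17:08:34",
                     "2010-02-11 17:13:36"), tz = "GMT")
)

# One data frame per bird
splitdata <- split(TB, TB$Bird, drop = TRUE)

# Number of duplicated timestamps within each bird
n_dup <- sapply(splitdata, function(birdup) sum(duplicated(birdup$gmt)))

# Remove duplicates within each bird
splitdata.nodupz <- lapply(splitdata, function(birdup)
  birdup[!duplicated(birdup$gmt), ])

# Stack the pieces back together; rbind builds row names such as "B18.1",
# so they are reset here
TB2 <- do.call(rbind, splitdata.nodupz)
rownames(TB2) <- NULL

# Same result in one step, keyed on time and bird
TB3 <- TB[!duplicated(paste(TB$gmt, TB$Bird)), ]
rownames(TB3) <- NULL

print(n_dup)
print(TB2)
print(all.equal(TB2, TB3))
```

Note that the split approach returns rows grouped by bird, so the two results only line up row for row when the input is already sorted by bird, as it is here.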
When I create an ltraj for only one individual (e.g. bird 18) it works
fine: tr18<-as.ltraj(xy[TB$Bird=="18",], date=TB$gmt[TB$Bird=="18"],
id="18")
One last thing to note: I'm not sure your id argument in the above
call should be "18". It may be designed to work that way, but I would
prefer to go with TB$Bird[TB$Bird=="18"], which will give you a vector
of values == "18" with one element per row of that bird's subset (the
same length as the xy and date arguments, not nrow(TB)).
Tyler
On 14/08/2010, at 3:04 , Tyler Dean Rudolph wrote:
I presume your date/time values are of class POSIXct and do include
times (if they don't, adehabitat will think multiple observations on a
given day are duplicates)? If that's all good, find out which POSIXct
values are repeated:
dupz = which(duplicated(mydates))
mydates = mydates[-dupz]
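To see why the time part matters, here is a short sketch on invented timestamps. It parses the same strings once with the full format and once with a date-only format, which is an assumed example of how the time could get lost.

```r
# Three made-up fixes: two on the same day, one on the next day
stamps <- c("11.02.2010 17:08:34", "11.02.2010 17:13:36", "12.02.2010 06:00:00")

# Full format: every fix keeps its own time
with_time <- as.POSIXct(strptime(stamps, "%d.%m.%Y %H:%M:%S", tz = "GMT"))

# Date-only format: strptime() ignores the trailing characters, so both
# fixes from 11 February collapse to midnight of that day
no_time <- as.POSIXct(strptime(stamps, "%d.%m.%Y", tz = "GMT"))

n_dup_with_time <- sum(duplicated(with_time))
n_dup_no_time <- sum(duplicated(no_time))

print(n_dup_with_time)
print(n_dup_no_time)
```

With the date-only format, the second fix looks like a duplicate of the first even though the raw data contain two distinct times.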
Tyler
On 2010-08-14, at 8:49, Julia Sommerfeld <[email protected]> wrote:
Hello,
Thank you for your fast reply. I do indeed have overlapping dates, i.e.
several birds have the same start and end date. But I'm not sure
how to overcome this problem. I couldn't find anything in adehabitat.
Cheers
Julia
On 14/08/2010, at 2:38 , Tyler Dean Rudolph wrote:
Your problem seems to be that you have duplicate (i.e. more than
one identical) dates for a given burst. I believe there are
arguments in adehabitat to check for this; if not, you will need to
verify this first.
On 2010-08-14, at 8:16, Julia Sommerfeld <[email protected]> wrote:
Dear All,
I would like to create an object of class "ltraj" for all GPS
positions (id=Bird) with time, but I always get the following
error message:
Error in as.ltraj(xy, TB$gmt, id, burst, typeII = TRUE, slsp =
c("remove", :
  non unique dates for a given burst
Do I need to create a burst? Could anyone tell me what the exact
difference between "id" and "burst" in as.ltraj is?
When I create an ltraj for only one individual (e.g. bird 18) it
works fine: tr18<-as.ltraj(xy[TB$Bird=="18",],
date=TB$gmt[TB$Bird=="18"], id="18")
But it doesn't work for all animals...
My script so far:
library(adehabitat)
library(ade4)
library(gpclib)
library(trip)
library(maps)
library(mapdata)
library(fields)
TB<-as.data.frame(read.table("GPS_2009_2010R.csv",header=T,sep=','))
xy<-TB[,c("LON","LAT")]
id<-TB[,c("Bird")]
### Conversion of the date to the format POSIX
gmt<-(TB$DateTime)
TB$gmt<-as.POSIXct(strptime(gmt, "%d.%m.%Y %H:%M:%S", tz="GMT"))
### Creation of the object of class "ltraj" for GPS positions with
### time
tr<-as.ltraj(xy, TB$gmt, id, typeII=TRUE, slsp = c("remove",
"missing"))
My csv.file contains the following columns:
Bird (ID)
DateTime (date and time in one column)
LON (longitude)
LAT (latitude)
Thank you very much.
Best regards,
Julia
_______________________________________________
AniMov mailing list
[email protected]
http://lists.faunalia.it/cgi-bin/mailman/listinfo/animov
Julia Sommerfeld - PhD Candidate
Institute for Marine and Antarctic Studies
University of Tasmania
Private Bag 129, Hobart
TAS 7001
Phone: +61 458 247 348
Email: [email protected]
--
~$ whoami
Mathieu Basille, Post-Doc
~$ locate
Laboratoire d'Écologie Comportementale et de Conservation de la Faune
+ Centre d'Étude de la Forêt
Département de Biologie
Université Laval, Québec
~$ info
http://ase-research.org/basille
~$ fortune
``If you can't win by reason, go for volume.''
Calvin, by Bill Watterson.