Re: [R] what is the best way to process the following data?

2016-06-17 Thread William Dunlap via R-help
You can make a step-number variable with cumsum(grepl("^Step ", ...)) and
use it as the splitting variable in split.  E.g.,

> dat <- read.table(yourFile, stringsAsFactors=FALSE, sep="|",
+     colClasses=c("NULL", "character", "character", "character"),
+     col.names=c("Junk", "Date", "Time", "Type"))
> dat <- with(dat, data.frame(
+     DateTime=as.POSIXct(paste(Date, Time), format="%m/%d/%Y %H:%M:%S"),
+     Type=Type, stringsAsFactors=FALSE))
> head(dat)
             DateTime           Type
1 2016-06-16 03:44:16       Step 001
2 2016-06-16 03:44:16 Initialization
3 2016-06-16 03:44:16        Filters
4 2016-06-16 03:45:03    Split Items
5 2016-06-16 03:46:20           Sort
6 2016-06-16 03:46:43          Check
> split(dat, cumsum(grepl("^Step ", dat$Type)))
$`1`
              DateTime                                        Type
1  2016-06-16 03:44:16                                    Step 001
2  2016-06-16 03:44:16                              Initialization
...
13 2016-06-16 04:06:33 BOP processing for 7,960 items has finished

$`2`
              DateTime                                        Type
14 2016-06-16 04:06:34                                    Step 002
15 2016-06-16 04:06:35                              Initialization
...
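
For the comparison itself, here is a minimal sketch of one way to carry the same step-number idea through to per-step elapsed times for two runs. It is only an illustration under some assumptions: the helper names read_steps() and step_minutes() are made up for this example, the two text excerpts are a few lines copied from the posted logs (the last serial line is cut off in the post and is used as it appears), "elapsed" is taken to mean last timestamp minus first timestamp within a step, and tz="UTC" is assumed so that daylight-saving shifts cannot affect the differences.

```r
# A few lines from each posted log, inlined so the sketch runs on its own.
parallel_txt <- "|06/16/2016|03:44:16|Step 001
|06/16/2016|03:44:16|Initialization
|06/16/2016|04:06:33|BOP processing for 7,960 items has finished
|06/16/2016|04:06:34|Step 002
|06/16/2016|04:06:35|Initialization
|06/16/2016|04:45:26|BOP processing for 8,420 items has finished"

serial_txt <- "|06/15/2016|22:52:46|Step 001
|06/15/2016|22:52:46|Initialization
|06/15/2016|23:13:16|BOP processing for 7,942 items has finished
|06/15/2016|23:13:17|Step 002
|06/15/2016|23:13:17|Initialization
|06/16/2016|00:45:09|BOP processing for 8,400"

# Hypothetical helper: parse the log text and attach a step counter.
read_steps <- function(text) {
  dat <- read.table(text = text, sep = "|", quote = "", comment.char = "",
                    stringsAsFactors = FALSE,
                    colClasses = c("NULL", "character", "character", "character"),
                    col.names = c("Junk", "Date", "Time", "Type"))
  dat$DateTime <- as.POSIXct(paste(dat$Date, dat$Time),
                             format = "%m/%d/%Y %H:%M:%S", tz = "UTC")
  # The counter increases by one on every row that starts a new step.
  dat$Step <- cumsum(grepl("^Step ", dat$Type))
  dat[, c("Step", "DateTime", "Type")]
}

# Hypothetical helper: minutes between first and last timestamp of each step.
step_minutes <- function(dat) {
  tapply(as.numeric(dat$DateTime), dat$Step,
         function(secs) diff(range(secs)) / 60)
}

p <- step_minutes(read_steps(parallel_txt))
s <- step_minutes(read_steps(serial_txt))

# Line the two runs up by step number; steps missing from either run drop out.
cmp <- merge(
  data.frame(Step = as.integer(names(p)), parallel_min = as.vector(p)),
  data.frame(Step = as.integer(names(s)), serial_min = as.vector(s)),
  by = "Step")
cmp$diff_min <- cmp$serial_min - cmp$parallel_min
print(cmp)
```

With real files you would pass the file name to read.table() instead of text=. If you want the time spent in each phase rather than in the whole step, diff(DateTime) within each piece of the split gives the gaps between consecutive rows.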



Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Thu, Jun 16, 2016 at 8:42 PM, Satish Vadlamani <
satish.vadlam...@gmail.com> wrote:

> Hello,
> I have multiple text files with the format shown below (see the two files
> that I pasted below). Each file is a log of multiple steps that the system
> has processed and for each step, it has shown the start time of the process
> step. For example, in the data below, the filter started at
> |06/16/2016|03:44:16
>
> How can I read this data so that Step 001 is one data frame, Step 002 is
> another, and so on? After I do this, I will compare the Step 001 times
> with and without parallel processing.
>
> For example, the files pasted below "no_parallel_process_SLS_4.txt" and
> "parallel_process_SLS_4.txt" will make it clear what I am trying to do. I
> want to compare the parallel process times taken for each step with the non
> parallel process times.
>
> If there are better ways of performing this task than what I am thinking,
> could you let me know? Thanks in advance.
>
> Satish Vadlamani
>
> >> parallel_process_file.txt
>
> |06/16/2016|03:44:16|Step 001
> |06/16/2016|03:44:16|Initialization
> |06/16/2016|03:44:16|Filters
> |06/16/2016|03:45:03|Split Items
> |06/16/2016|03:46:20|Sort
> |06/16/2016|03:46:43|Check
> |06/16/2016|04:01:13|Save
> |06/16/2016|04:04:35|Update preparation
> |06/16/2016|04:04:36|Update comparison
> |06/16/2016|04:04:38|Update
> |06/16/2016|04:04:38|Update
> |06/16/2016|04:06:01|Close
> |06/16/2016|04:06:33|BOP processing for 7,960 items has finished
> |06/16/2016|04:06:34|Step 002
> |06/16/2016|04:06:35|Initialization
> |06/16/2016|04:06:35|Filters
> |06/16/2016|04:07:14|Split Items
> |06/16/2016|04:08:57|Sort
> |06/16/2016|04:09:06|Check
> |06/16/2016|04:26:36|Save
> |06/16/2016|04:39:29|Update preparation
> |06/16/2016|04:39:31|Update comparison
> |06/16/2016|04:39:43|Update
> |06/16/2016|04:39:45|Update
> |06/16/2016|04:44:28|Close
> |06/16/2016|04:45:26|BOP processing for 8,420 items has finished
> |06/16/2016|04:45:27|Step 003
> |06/16/2016|04:45:27|Initialization
> |06/16/2016|04:45:27|Filters
> |06/16/2016|04:48:50|Split Items
> |06/16/2016|04:55:15|Sort
> |06/16/2016|04:55:40|Check
> |06/16/2016|05:13:35|Save
> |06/16/2016|05:17:34|Update preparation
> |06/16/2016|05:17:34|Update comparison
> |06/16/2016|05:17:36|Update
> |06/16/2016|05:17:36|Update
> |06/16/2016|05:19:29|Close
> |06/16/2016|05:19:49|BOP processing for 8,876 items has finished
> |06/16/2016|05:19:50|Step 004
> |06/16/2016|05:19:50|Initialization
> |06/16/2016|05:19:50|Filters
> |06/16/2016|05:20:43|Split Items
> |06/16/2016|05:22:14|Sort
> |06/16/2016|05:22:29|Check
> |06/16/2016|05:37:27|Save
> |06/16/2016|05:38:43|Update preparation
> |06/16/2016|05:38:44|Update comparison
> |06/16/2016|05:38:45|Update
> |06/16/2016|05:38:45|Update
> |06/16/2016|05:39:09|Close
> |06/16/2016|05:39:19|BOP processing for 5,391 items has finished
> |06/16/2016|05:39:20|Step 005
> |06/16/2016|05:39:20|Initialization
> |06/16/2016|05:39:20|Filters
> |06/16/2016|05:39:57|Split Items
> |06/16/2016|05:40:21|Sort
> |06/16/2016|05:40:24|Check
> |06/16/2016|05:46:01|Save
> |06/16/2016|05:46:54|Update preparation
> |06/16/2016|05:46:54|Update comparison
> |06/16/2016|05:46:54|Update
> |06/16/2016|05:46:55|Update
> |06/16/2016|05:47:24|Close
> |06/16/2016|05:47:31|BOP processing for 3,016 items has finished
> |06/16/2016|05:47:32|Step 006
> |06/16/2016|05:47:32|Initialization
> |06/16/2016|05:47:32|Filters
> |06/16/2016|05:47:32|Update preparation
> |06/16/2016|05:47:32|Update comparison
> |06/16/2016|05:47:32|Update
> |06/16/2016|05:47:32|Close
> |06/16/2016|05:47:33|BOP processing for 0 items has finished
> |06/16/2016|05:47:33|Step 007
> |06/16/2016|05:47:33|Initialization
> 

[R] what is the best way to process the following data?

2016-06-17 Thread Satish Vadlamani
Hello,
I have multiple text files with the format shown below (see the two files
that I pasted below). Each file is a log of multiple steps that the system
has processed and for each step, it has shown the start time of the process
step. For example, in the data below, the filter started at
|06/16/2016|03:44:16

How can I read this data so that Step 001 is one data frame, Step 002 is
another, and so on? After I do this, I will compare the Step 001 times
with and without parallel processing.

For example, the files pasted below "no_parallel_process_SLS_4.txt" and
"parallel_process_SLS_4.txt" will make it clear what I am trying to do. I
want to compare the parallel process times taken for each step with the non
parallel process times.

If there are better ways of performing this task than what I am thinking,
could you let me know? Thanks in advance.

Satish Vadlamani

>> parallel_process_file.txt

|06/16/2016|03:44:16|Step 001
|06/16/2016|03:44:16|Initialization
|06/16/2016|03:44:16|Filters
|06/16/2016|03:45:03|Split Items
|06/16/2016|03:46:20|Sort
|06/16/2016|03:46:43|Check
|06/16/2016|04:01:13|Save
|06/16/2016|04:04:35|Update preparation
|06/16/2016|04:04:36|Update comparison
|06/16/2016|04:04:38|Update
|06/16/2016|04:04:38|Update
|06/16/2016|04:06:01|Close
|06/16/2016|04:06:33|BOP processing for 7,960 items has finished
|06/16/2016|04:06:34|Step 002
|06/16/2016|04:06:35|Initialization
|06/16/2016|04:06:35|Filters
|06/16/2016|04:07:14|Split Items
|06/16/2016|04:08:57|Sort
|06/16/2016|04:09:06|Check
|06/16/2016|04:26:36|Save
|06/16/2016|04:39:29|Update preparation
|06/16/2016|04:39:31|Update comparison
|06/16/2016|04:39:43|Update
|06/16/2016|04:39:45|Update
|06/16/2016|04:44:28|Close
|06/16/2016|04:45:26|BOP processing for 8,420 items has finished
|06/16/2016|04:45:27|Step 003
|06/16/2016|04:45:27|Initialization
|06/16/2016|04:45:27|Filters
|06/16/2016|04:48:50|Split Items
|06/16/2016|04:55:15|Sort
|06/16/2016|04:55:40|Check
|06/16/2016|05:13:35|Save
|06/16/2016|05:17:34|Update preparation
|06/16/2016|05:17:34|Update comparison
|06/16/2016|05:17:36|Update
|06/16/2016|05:17:36|Update
|06/16/2016|05:19:29|Close
|06/16/2016|05:19:49|BOP processing for 8,876 items has finished
|06/16/2016|05:19:50|Step 004
|06/16/2016|05:19:50|Initialization
|06/16/2016|05:19:50|Filters
|06/16/2016|05:20:43|Split Items
|06/16/2016|05:22:14|Sort
|06/16/2016|05:22:29|Check
|06/16/2016|05:37:27|Save
|06/16/2016|05:38:43|Update preparation
|06/16/2016|05:38:44|Update comparison
|06/16/2016|05:38:45|Update
|06/16/2016|05:38:45|Update
|06/16/2016|05:39:09|Close
|06/16/2016|05:39:19|BOP processing for 5,391 items has finished
|06/16/2016|05:39:20|Step 005
|06/16/2016|05:39:20|Initialization
|06/16/2016|05:39:20|Filters
|06/16/2016|05:39:57|Split Items
|06/16/2016|05:40:21|Sort
|06/16/2016|05:40:24|Check
|06/16/2016|05:46:01|Save
|06/16/2016|05:46:54|Update preparation
|06/16/2016|05:46:54|Update comparison
|06/16/2016|05:46:54|Update
|06/16/2016|05:46:55|Update
|06/16/2016|05:47:24|Close
|06/16/2016|05:47:31|BOP processing for 3,016 items has finished
|06/16/2016|05:47:32|Step 006
|06/16/2016|05:47:32|Initialization
|06/16/2016|05:47:32|Filters
|06/16/2016|05:47:32|Update preparation
|06/16/2016|05:47:32|Update comparison
|06/16/2016|05:47:32|Update
|06/16/2016|05:47:32|Close
|06/16/2016|05:47:33|BOP processing for 0 items has finished
|06/16/2016|05:47:33|Step 007
|06/16/2016|05:47:33|Initialization
|06/16/2016|05:47:33|Filters
|06/16/2016|05:47:34|Split Items
|06/16/2016|05:47:34|Sort
|06/16/2016|05:47:34|Check
|06/16/2016|05:47:37|Save
|06/16/2016|05:47:37|Update preparation
|06/16/2016|05:47:37|Update comparison
|06/16/2016|05:47:37|Update
|06/16/2016|05:47:37|Update
|06/16/2016|05:47:37|Close
|06/16/2016|05:47:37|BOP processing for 9 items has finished
|06/16/2016|05:47:37|Step 008
|06/16/2016|05:47:37|Initialization
|06/16/2016|05:47:37|Filters
|06/16/2016|05:47:38|Update preparation
|06/16/2016|05:47:38|Update comparison
|06/16/2016|05:47:38|Update
|06/16/2016|05:47:38|Close
|06/16/2016|05:47:38|BOP processing for 0 items has finished




>> no_parallel_process_file.txt

|06/15/2016|22:52:46|Step 001
|06/15/2016|22:52:46|Initialization
|06/15/2016|22:52:46|Filters
|06/15/2016|22:54:21|Split Items
|06/15/2016|22:55:10|Sort
|06/15/2016|22:55:15|Check
|06/15/2016|23:04:43|Save
|06/15/2016|23:06:38|Update preparation
|06/15/2016|23:06:38|Update comparison
|06/15/2016|23:06:39|Update
|06/15/2016|23:06:39|Update
|06/15/2016|23:12:04|Close
|06/15/2016|23:13:16|BOP processing for 7,942 items has finished
|06/15/2016|23:13:17|Step 002
|06/15/2016|23:13:17|Initialization
|06/15/2016|23:13:17|Filters
|06/15/2016|23:16:27|Split Items
|06/15/2016|23:20:18|Sort
|06/15/2016|23:20:34|Check
|06/16/2016|00:08:08|Save
|06/16/2016|00:26:19|Update preparation
|06/16/2016|00:26:20|Update comparison
|06/16/2016|00:26:30|Update
|06/16/2016|00:26:31|Update
|06/16/2016|00:42:31|Close
|06/16/2016|00:45:09|BOP processing for 8,400