[PHP-DB] Re: txt to db, file() bug?

2003-10-23 Thread Justin Patrin
You may be having problems with your delimiter (,) being within quotes. 
This can be a problem both for your ',' splitting and for the file() 
splitting (\n). fgetcsv() seems to do some of this, but I'm not sure of 
its completeness. Here's a function I wrote and some possible code for you:

/**
 * Does a regular split, but also accounts for the delimiter appearing
 * within quoted fields.
 *  For example, if called as such:
 *   splitQuoteFriendly(',', '0,1,2,"3,I am still 3",4');
 *  it will return:
 *   array(0 => '0',
 *         1 => '1',
 *         2 => '2',
 *         3 => '"3,I am still 3"',
 *         4 => '4');
 * @param string $delim delimiter to split by
 * @param string $text  text to split
 * @param string $quote text which surrounds quoted fields (defaults to ")
 * @return array array of fields after split
 */
function splitQuoteFriendly($delim, $text, $quote = '"') {
  // explode() does a plain (non-regex) split; split() was removed in PHP 7
  $strictFields = explode($delim, $text);
  $count = count($strictFields);
  $fields = array();
  for ($sl = 0, $l = 0; $sl < $count; ++$sl) {
    $fields[$l] = $strictFields[$sl];
    $numQuotes = substr_count($strictFields[$sl], $quote);
    // An odd quote count means the delimiter fell inside a quoted field,
    // so keep gluing pieces back on until the quotes balance
    // (or the input runs out, if a quote was never closed).
    while ($numQuotes % 2 == 1 && $sl + 1 < $count) {
      ++$sl;
      $fields[$l] .= $delim.$strictFields[$sl];
      $numQuotes += substr_count($strictFields[$sl], $quote);
    }
    ++$l;
  }

  return $fields;
}
$text = file_get_contents($filename);
$lines = splitQuoteFriendly("\n", $text, '"');
foreach ($lines as $line) {
  $parts = splitQuoteFriendly(',', $line, '"');
}
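
For comparison, here is a minimal sketch of the fgetcsv() route mentioned above. It assumes quoted fields use double quotes and a backslash escape character; the helper name readCsvRows() and the sample data are made up for illustration and are not part of the original script.

```php
<?php
// Hypothetical helper: read a CSV file row by row with fgetcsv(),
// which keeps a delimiter that sits inside a quoted field in that field.
function readCsvRows($filename) {
    $rows = array();
    $handle = fopen($filename, 'r');
    if ($handle === false) {
        return $rows; // could not open the file
    }
    // 0 = no line-length limit; ',' delimiter; '"' enclosure; '\' escape
    while (($fields = fgetcsv($handle, 0, ',', '"', '\\')) !== false) {
        if ($fields === array(null)) {
            continue; // fgetcsv() reports a blank line as array(null)
        }
        $rows[] = $fields;
    }
    fclose($handle);
    return $rows;
}

// Made-up sample data, written to a temporary file for the demo.
$tmp = tempnam(sys_get_temp_dir(), 'csv');
file_put_contents($tmp, "0,1,2,\"3,I am still 3\",4\na,b,c\n");
$rows = readCsvRows($tmp);
unlink($tmp);
print_r($rows);
```

Unlike splitQuoteFriendly(), fgetcsv() strips the surrounding quotes from each field, so the fourth field of the first row comes back as 3,I am still 3 without the quote characters.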
Justin Patrin

Jeffrey N Dyke wrote:

OS: Solaris 8
PHP 4.3.2 and PHP 4.2.3, running as CLI from cron and the command line.
I have a log file that I'm processing, and I need to insert only
certain fields from it into the database.  I open the file using
file("filename.csv"), which returns the array for me to process.  I process
all the lines of the file with:
while(list($line_number, $line) = each($fp)) {
  $line_array = explode(",",$line);
etc
There are 15 comma-separated 'columns' in the CSV file, of which I only
need to input 6.  Every once in a while I don't get all of the data within
the fields, or the comma (,) becomes part of the field.  But the other
95% of the rows process correctly.
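
One way to narrow down which rows belong to the failing few percent is to compare field counts per line. The following is only a diagnostic sketch: the function name findOddLines() and the sample lines are made up, and it assumes that a well-formed line has the most common field count in the file.

```php
<?php
// Hypothetical diagnostic: report lines whose comma-separated field
// count differs from the most common count in the input.
function findOddLines(array $lines) {
    $counts = array();
    foreach ($lines as $n => $line) {
        $counts[$n] = count(explode(',', rtrim($line, "\r\n")));
    }
    if (!$counts) {
        return array();
    }
    $freq = array_count_values($counts);
    arsort($freq);          // most frequent field count first
    reset($freq);
    $expected = key($freq); // the field count shared by most lines
    $odd = array();
    foreach ($counts as $n => $c) {
        if ($c !== $expected) {
            $odd[$n] = $c;  // line index => actual field count
        }
    }
    return $odd;
}

// Made-up lines for the demo; real input would come from file().
$lines = array("a,b,c,\n", "d,e,f,\n", "g,h\n", "i,j,k,\n");
$odd = findOddLines($lines);
print_r($odd);
```

If this reports nothing for the real file, every line splits into the same number of fields and the problem is more likely downstream of the split.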
Below is some sample data; I'm happy to provide more to anyone with
questions.  Thanks for any opinions.  I hope this makes [some] sense.
Jeff

The assignments of the values that I'm putting into my queries are
simply:
$query = "INSERT INTO TA_RAFEED_JD SET " .
   "LoggedAt = '".$line_array[0]. "', " .
   "User_Name = '".$line_array[2]."', " .
   "Network_Device_Group = '".$line_array[3]."', " .
   "Group_Name = 'Dial', " .
   "Caller_Id = '".$line_array[4]. "', " .
   "elapsed_time = ".$line_array[6];
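
As a sanity check on the assignments above, a small sketch like the following can show whether the split itself is losing data. It runs on the second sample log line from this message; the helper name extractFields() and the m/d/Y to Y-m-d date conversion are assumptions for illustration, not the original script.

```php
<?php
// Hypothetical helper: pick the columns used in the INSERT out of one
// log line. Values stay strings, so leading zeros are preserved.
function extractFields($line) {
    $cols = explode(',', rtrim($line, "\r\n"));
    if (count($cols) < 7) {
        return null; // short or malformed line: skip rather than guess
    }
    // Assumption: LoggedAt arrives as m/d/Y and is stored as Y-m-d.
    $date = DateTime::createFromFormat('m/d/Y', $cols[0]);
    return array(
        'LoggedAt'             => $date ? $date->format('Y-m-d') : null,
        'User_Name'            => $cols[2],
        'Network_Device_Group' => $cols[3],
        'Group_Name'           => 'Dial',
        'Caller_Id'            => $cols[4],
        'elapsed_time'         => (int) $cols[6],
    );
}

$line = "10/22/2003,07:34:42,002375,Dial-VPN,7877910505/8885326281,stop,1055,ppp,150304,196686,419,317,89607,192.168.98.115,Async55,10.1.1.28,\n";
$row = extractFields($line);
var_dump($row['User_Name']);
```

If User_Name comes back as the string 002375 here but not in the real run, the split is probably not the culprit, and dumping the input array and the finished query string would be the next things to look at.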
-- trims 0's off values and changes them??? --
-- Good log line prior to error'd line --
Log file --
10/22/2003,07:34:23,000512,Dial-VPN,5022278197/8885326281,stop,2135,ppp,56671,46034,581,243,89602,192.168.98.218,Async190,10.1.1.28,
DB Query - INSERT INTO TA_RAFEED_JD SET LoggedAt = '2003-10-22', User_Name
= '000512', Network_Device_Group = 'Dial-VPN', Group_Name = 'Dial',
Caller_Id = '5022278197/8885326281', elapsed_time = 2135
-- trimmed the two leading zeros off the string 002375 and made
it 22375 --
Log file --
10/22/2003,07:34:42,002375,Dial-VPN,7877910505/8885326281,stop,1055,ppp,150304,196686,419,317,89607,192.168.98.115,Async55,10.1.1.28,
Db query -- INSERT INTO TA_RAFEED_JD SET LoggedAt = '2003-10-22', User_Name
= '22375', Network_Device_Group = 'Dial-VPN', Group_Name = 'Dial',
Caller_Id = '7877910505/8885326281', elapsed_time = 1055
--same as above --
10/22/2003,16:52:59,007138,Dial-VPN,2122011073/8885326281,stop,205,ppp,28543,51643,171,113,89784,192.168.98.232,Async100,10.1.1.28,
INSERT INTO TA_RAFEED_JD SET LoggedAt = '2003-10-22', User_Name = '17138',
Network_Device_Group = 'Dial-VPN', Group_Name = 'Dial', Caller_Id
= '2122011073/8885326281', elapsed_time = 205
-- chooses the WRONG value: the value that it's putting in the 'User_Name'
field is a concatenation of other fields, not being processed??? --
-- correct line prior to error --
10/22/2003,08:02:21,002357,Dial-VPN,5072878991/8885326281,stop,3580,ppp,650218,6754168,6947,6858,89604,192.168.98.65,Async23,10.1.1.28,
INSERT INTO TA_RAFEED_JD SET LoggedAt = '2004-10-22', User_Name = '002357',
Network_Device_Group = 'Dial-VPN', Group_Name = 'Dial',
Caller_Id = '5072878991/8885326281', elapsed_time = 3580
-- error'd record: instead of inserting ka1497 into 'User_Name' I got
'stop7' --
10/22/2003,08:17:50,ka1497,Dial-VPN,4169249221/8885326281,stop,5127,ppp,738105,4533604,4701,4603,89600,192.168.98.85,Async186,10.1.1.28,
INSERT INTO TA_RAFEED_JD SET LoggedAt = '2003-10-22', User_Name = 'stop7',
Network_Device_Group = 'Dial-VPN', Group_Name = 'Dial',
Caller_Id = '4169249221/8885326281', elapsed_time = 5127
-- same as above: inserted stop1 instead of 001691 --
10/22/2003,12:02:19,001691,Dial-VPN,4142713011/8885326281,stop,754,ppp,65008,243490,419,365,89696,192.168.98.114,Async124,10.1.1.28,
INSERT INTO TA_RAFEED_JD SET LoggedAt = '2003-10-22', User_Name = 'stop1',

Re: [PHP-DB] Re: txt to db, file() bug?

2003-10-23 Thread jeffrey_n_Dyke

hmmm.  thanks for this.  the file actually does not have quoted entries
between the commas.  it is just:
text,text,text . ,text \n

that does not take away from the assistance... it gives me more things to think
about.

thank you.
jeff


   
