Hi Jeff!

We've just been doing the same thing down here. CSV bank data threw me off
as well. I used java.util.StringTokenizer rather than the utils classes, and
it kindly mishandled empty values: it skips them silently, so a line like
"123,,223.69,12-11-2003,8495-399,DIRECT DEBIT" comes back one token short.
If CSVParser is missing the first field, you can bet it'll do the same for
empty fields in-line.
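
To make the StringTokenizer behaviour concrete, here is a minimal sketch that
runs the sample line above through java.util.StringTokenizer. The class and
method names (EmptyFieldDemo, tokenize) are made up for illustration only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

class EmptyFieldDemo
{
        // Collects every token java.util.StringTokenizer returns for a
        // comma-delimited line (hypothetical helper, for illustration).
        static List<String> tokenize(String line)
        {
                List<String> out = new ArrayList<String>();
                StringTokenizer tok = new StringTokenizer(line, ",");
                while (tok.hasMoreTokens())
                        out.add(tok.nextToken());
                return out;
        }

        public static void main(String[] args)
        {
                List<String> fields = tokenize("123,,223.69,12-11-2003,8495-399,DIRECT DEBIT");
                // The line has six fields, but the empty second one is never returned.
                System.out.println(fields.size() + " tokens: " + fields);
        }
}
```

The line holds six fields but only five tokens come back, so everything after
the empty field shifts one place to the left.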

We solved this problem by implementing a "strict" tokenizer:

------[com.genix.commons.tools.StrictTokenizer]--------------

package com.genix.commons.tools;

import java.util.Enumeration;
import java.util.NoSuchElementException;
import java.util.Vector;

/**
 * @author Greg Kerdemelidis ([EMAIL PROTECTED])
 */
public class StrictTokenizer implements Enumeration
{
        String _data;
        char _delim;
        Vector _tokens;
        int _count;
        
        public StrictTokenizer(String input, char delim)
        {
                _data=input;
                _delim=delim;
                parse();
                _count=0;
        }
        
        private void parse()
        {
                char[] dat = _data.toCharArray();
                _tokens = new Vector();
                
                char[] buff = new char[100];            // bad bad bad
                int buffc = 0;
                
                for (int i = 0; i < dat.length; i++)
                {
                        char c = dat[i];
                        
                        if(c!=_delim)
                                buff[buffc++]=c;
                        else
                        {
                                
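                                // delimiter reached: emit the field so far ("" when buffc==0)
                                _tokens.add(String.copyValueOf(buff,0,buffc));
                                buffc=0;        // reset buff pointer
                        }
                }
                
                // the last field has no trailing delimiter, so emit it here
                _tokens.add(String.copyValueOf(buff,0,buffc));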
        }
        
        public String getNext() throws NoSuchElementException
        {
                if(_count==_tokens.size())
                        throw new NoSuchElementException();
                return (String)_tokens.get(_count++);
        }
        
        public String nextToken() throws NoSuchElementException
        {
                return getNext();
        }
        
        public boolean hasNext()
        {
                return _count<_tokens.size();
        }
        
        public int getTokens()
        {
                return _tokens.size();
        }
        
        public void reset(int num)
        {
                _count=num;
        }
        
        public void reset()
        {
                this.reset(0);
        }

        public boolean hasMoreElements()
        {
                return hasNext();
        }

        public Object nextElement()
        {
                return getNext();
        }
}

-------[end]-------------

Use it the same way as java.util.StringTokenizer:

String inputLine = CSV.readLine();
StrictTokenizer tok = new StrictTokenizer(inputLine, ',');

String first = tok.getNext();   
String second = tok.getNext();

It'll only throw a NoSuchElementException if you iterate off the end of the
string.

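The programming police won't like my "char[] buff = new char[100];", but
then neither do I. YMMV - it's perfect for our use as no CSV input line is
greater than 100 chars. Strictly the limit is per field, since the buffer is
reused for each token; a single field longer than 100 chars will throw an
ArrayIndexOutOfBoundsException.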
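
If the fixed buffer bothers you, here is a rough sketch of the same idea using
indexOf/substring, so there is nothing to overflow. It is not what we run
here; GrowableSplit and splitStrict are hypothetical names, and like
StrictTokenizer it knows nothing about quoted fields or commas inside quotes.

```java
import java.util.ArrayList;
import java.util.List;

class GrowableSplit
{
        // Hypothetical helper, not part of StrictTokenizer: splits on a single
        // delimiter char and keeps empty fields, with no fixed-size buffer.
        static List<String> splitStrict(String input, char delim)
        {
                List<String> tokens = new ArrayList<String>();
                int start = 0;
                int end;
                while ((end = input.indexOf(delim, start)) != -1)
                {
                        tokens.add(input.substring(start, end));   // "" when start == end
                        start = end + 1;
                }
                tokens.add(input.substring(start));     // last field, possibly empty
                return tokens;
        }

        public static void main(String[] args)
        {
                System.out.println(splitStrict("123,,223.69,12-11-2003,8495-399,DIRECT DEBIT", ','));
        }
}
```

A blank first field (",Bank of America,1432") comes back as an empty string in
position 0 rather than being dropped, which is the behaviour Jeff is after.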

Hope that helps, mate!

Regards,

-Greg


> -----Original Message-----
> From: Jeff Painter [mailto:[EMAIL PROTECTED]
> Sent: Wednesday, 12 November 2003 8:37 a.m.
> To: [EMAIL PROTECTED]
> Subject: question on CSVParser
> 
> 
>  org.apache.turbine.util.CSVParser is giving me some trouble
> 
> I have an import routine that will probably have many fields left blank.
> Unfortunately it looks like if the first field is left blank, it throws
> off that whole line for retrieving values with ValueParser.
> 
> example
> 
>       row 1 always contains my headers
>       System ID, Bank, Branch Number, Address
> 
>       subsequent rows carry data
>       100,"Bank of America", 1230, "123 Main St."
>       ,"Bank of America", 1432, "923 Front St."
> 
> I want to keep the format of the csv consistent for allowing multiple
> updates as well as new data entry en masse.
> 
> For line 1, all the data comes through correctly, if ValueParser can find
> a system id, then it will attempt to update the info rather than create a
> new entry. My goal is that for line 2, it will see no value for System ID
> and then attempt to create a new entry.
> 
> However, from my logs, all the fields are shifted one place to the left
> since System Id is blank
> 
> log output:
> 
> [Tue Nov 11 14:18:18 EST 2003] -- INFO -- Adding new PT_BRANCH
> [Tue Nov 11 14:18:18 EST 2003] -- DEBUG -- ID not found, input as new
> participant.
> [Tue Nov 11 14:18:18 EST 2003] -- DEBUG -- Bank: 8830
> [Tue Nov 11 14:18:18 EST 2003] -- DEBUG -- Branch #: Franklinton
> [Tue Nov 11 14:18:18 EST 2003] -- DEBUG -- Branch: SWL - SW Rural (Lake
> Charles)
> [Tue Nov 11 14:18:18 EST 2003] -- DEBUG -- Region: 946 Pearl Street
> 
> so it should have been branch # 8830 and not Franklinton... bleah
> 
> any pointers on how to fix this or do I have to setup two different
> routines... one for imports and one for updates. I haven't tested to see
> if fields left blank in between are accounted for correctly.
> 
> I'm using tdk-2.2_01 release of turbine and utils
> 
> --
> Regards,
> 
> Jeffery Painter
> 
> - --
> [EMAIL PROTECTED]                     http://kiasoft.com
> PGP FP: 9CE8 83A2 33FA 32B1 0AB1  4E62 E4CB E4DA 5913 EFBC
> 
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.2.1 (GNU/Linux)
> 
> iD8DBQE/qEQE5Mvk2lkT77wRAnMJAJ9vJ6qOkg/mvqqIpz7troCEQJ8bFACglu/U
> YNXabx7DZOV2Hd9LwSTmGpY=
> =dWiu
> -----END PGP SIGNATURE-----
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]


