Writing code before agreeing on a design is putting the cart before the horse.

If there is immediate push-back on a design, then why would anyone want to write the code for it?

I don't see how HttpComponents is relevant. So what if they deprecated their decorators? Oracle continues marching on with in/out stream decorators, reader/writer decorators...

A Reader should be something simple to use, it's just a means to read characters. If I want to add buffering, I decorate it with BufferedReader. If I want to add line numbering, I decorate it with LineNumberReader. If I want to add...

The Java API includes those classes as a convenience - and we are talking about something even more low-level than CSV file parsing. The decorator pattern exists for a reason - and the Java IO classes are a perfect application of it.

I can't imagine a closer parallel to this discussion.

Adrian Crum
Sandglass Software
www.sandglass-software.com

On 10/29/2014 5:01 PM, Gary Gregory wrote:
This should be something simple to use, it's just a parser for a 'simple'
file format. I get a format, a parser and my data. If each new feature is a
decorator I'm going to end up creating 5 classes to use 5 features, no
thanks. Like you said, it could be a matter of personal taste. Over at
HttpComponents HttpClient, the decorators have been deprecated in favor of
builders. Here at CSV, we have a builder for CSV formats. So maybe that
should be used, at least it would be consistent. Also, until someone tries
to create a decorator, it's just talk, and from one of the comments, it
might not be doable...

Gary

On Wed, Oct 29, 2014 at 12:39 PM, Adrian Crum <
adrian.c...@sandglass-software.com> wrote:

Generally speaking, good designs are not based on class count.

If we simply add methods to the parser every time a user comes up with a
new use case, then eventually we will end up with a single class that tries
to be all things to all people.

The decorator pattern is ideally suited for situations like this:

1. I need a basic CSV parser: Use the basic parser.
2. I need a CSV parser that supports random access: Use basic parser
decorated by Random Access decorator.
3. I need a CSV parser that supports foo: Use basic parser decorated by
Foo decorator.
4. I need...

 From my perspective, that makes client code cleaner and simpler, not more
complicated.

But like many design discussions, that is a personal preference and others
may feel differently.

Adrian Crum
Sandglass Software
www.sandglass-software.com

On 10/29/2014 4:08 PM, Gary Gregory wrote:

On Wed, Oct 29, 2014 at 4:13 AM, Benedikt Ritter <brit...@apache.org>
wrote:

  Hi Gary,

what do you think about the Decorator approach, suggested by Adrian Crum
in
JIRA?


In theory, it's OK, but in this specific case, it does not seem to me
worth
the cost (an extra class) and complexity (it makes the client code more
complicated). It is also not clear that it would be easy to do based on
the
other comments in the JIRA. I do not plan on researching this path but I
suppose I would examine a patch.

Gary



Benedikt

2014-10-29 6:44 GMT+01:00 <ggreg...@apache.org>:

  Author: ggregory
Date: Wed Oct 29 05:44:40 2014
New Revision: 1635052

URL: http://svn.apache.org/r1635052
Log:
[CSV-131] Save positions of records to enable random access. The floor
is
open for code review and further discussion based on the comments in the
Jira.

Modified:
      commons/proper/csv/trunk/src/changes/changes.xml


  commons/proper/csv/trunk/src/main/java/org/apache/commons/
csv/CSVParser.java



  commons/proper/csv/trunk/src/main/java/org/apache/commons/
csv/CSVRecord.java



  commons/proper/csv/trunk/src/test/java/org/apache/commons/
csv/CSVParserTest.java



  commons/proper/csv/trunk/src/test/java/org/apache/commons/
csv/CSVRecordTest.java


Modified: commons/proper/csv/trunk/src/changes/changes.xml
URL:

  http://svn.apache.org/viewvc/commons/proper/csv/trunk/src/
changes/changes.xml?rev=1635052&r1=1635051&r2=1635052&view=diff



  ============================================================
==================

--- commons/proper/csv/trunk/src/changes/changes.xml (original)
+++ commons/proper/csv/trunk/src/changes/changes.xml Wed Oct 29
05:44:40
2014
@@ -39,6 +39,7 @@
     </properties>
     <body>
       <release version="1.1" date="2014-mm-dd" description="Feature and

bug

fix release">
+      <action issue="CSV-131" type="add" dev="ggregory" due-to="Holger
Stratmann">Save positions of records to enable random access</action>
         <action issue="CSV-130" type="fix" dev="ggregory" due-to="Sergei
Lebedev">CSVFormat#withHeader doesn't work well with #printComment, add
withHeaderComments(String...)</action>
         <action issue="CSV-128" type="fix"
dev="ggregory">CSVFormat.EXCEL
should ignore empty header names</action>
         <action issue="CSV-129" type="add" dev="ggregory">Add
CSVFormat#with 0-arg methods matching boolean arg methods</action>

Modified:

  commons/proper/csv/trunk/src/main/java/org/apache/commons/
csv/CSVParser.java

URL:

  http://svn.apache.org/viewvc/commons/proper/csv/trunk/src/
main/java/org/apache/commons/csv/CSVParser.java?rev=
1635052&r1=1635051&r2=1635052&view=diff



  ============================================================
==================

---

  commons/proper/csv/trunk/src/main/java/org/apache/commons/
csv/CSVParser.java

(original)
+++

  commons/proper/csv/trunk/src/main/java/org/apache/commons/
csv/CSVParser.java

Wed Oct 29 05:44:40 2014
@@ -219,6 +219,12 @@ public final class CSVParser implements
       private final List<String> record = new ArrayList<String>();

       private long recordNumber;
+
+    /**
+     * Lexer offset if the parser does not start parsing at the

beginning

of the source. Usually used in combination
+     * with {@link #setNextRecordNumber(long)}
+     */
+    private long characterOffset;

       private final Token reusableToken = new Token();

@@ -296,6 +302,43 @@ public final class CSVParser implements
       }

       /**
+     * Sets the record number to be assigned to the next record read.
+     * <p>
+     * Use this if the reader is not positioned at the first record
when
you create the parser. For example, the first
+     * record read might be the 51st record in the source file.
+     * </p>
+     * <p>
+     * If you want the records to also have the correct character
position referring to the underlying source, call
+     * {@link #setNextCharacterPosition(long)}.
+     * </p>
+     *
+     * @param nextRecordNumber
+     *            the next record number
+     * @since 1.1
+     */
+    public void setNextRecordNumber(long nextRecordNumber) {
+        this.recordNumber = nextRecordNumber - 1;
+    }
+
+    /**
+     * Sets the current position in the source stream regardless of

where

the parser and lexer start reading.
+     * <p>
+     * For example: We open a file and seek to position 5434 in order
to
start reading at record 42. In order to have
+     * the parser assign the correct characterPosition to records, we
call this method.
+     * </p>
+     * <p>
+     * If you want the records to also have the correct record numbers,
call {@link #setNextRecordNumber(long)}
+     * </p>
+     *
+     * @param position
+     *            the new character position
+     * @since 1.1
+     */
+    public void setNextCharacterPosition(long position) {
+        this.characterOffset = position - lexer.getCharacterPosition();
+    }
+
+    /**
        * Returns the current record number in the input stream.
        *
        * <p>
@@ -445,6 +488,7 @@ public final class CSVParser implements
           CSVRecord result = null;
           this.record.clear();
           StringBuilder sb = null;
+        final long startCharPosition = lexer.getCharacterPosition() +
this.characterOffset;
           do {
               this.reusableToken.reset();
               this.lexer.nextToken(this.reusableToken);
@@ -480,7 +524,7 @@ public final class CSVParser implements
               this.recordNumber++;
               final String comment = sb == null ? null : sb.toString();
               result = new CSVRecord(this.record.toArray(new
String[this.record.size()]), this.headerMap, comment,
-                    this.recordNumber);
+                    this.recordNumber, startCharPosition);
           }
           return result;
       }

Modified:

  commons/proper/csv/trunk/src/main/java/org/apache/commons/
csv/CSVRecord.java

URL:

  http://svn.apache.org/viewvc/commons/proper/csv/trunk/src/
main/java/org/apache/commons/csv/CSVRecord.java?rev=
1635052&r1=1635051&r2=1635052&view=diff



  ============================================================
==================

---

  commons/proper/csv/trunk/src/main/java/org/apache/commons/
csv/CSVRecord.java

(original)
+++

  commons/proper/csv/trunk/src/main/java/org/apache/commons/
csv/CSVRecord.java

Wed Oct 29 05:44:40 2014
@@ -36,6 +36,8 @@ public final class CSVRecord implements

       private static final long serialVersionUID = 1L;

+    private final long characterPosition;
+
       /** The accumulated comments (if any) */
       private final String comment;

@@ -44,15 +46,16 @@ public final class CSVRecord implements

       /** The record number. */
       private final long recordNumber;
-
+
       /** The values of the record */
       private final String[] values;

-    CSVRecord(final String[] values, final Map<String, Integer>
mapping,
final String comment, final long recordNumber) {
+    CSVRecord(final String[] values, final Map<String, Integer>
mapping,
final String comment, final long recordNumber, long characterPosition) {
           this.recordNumber = recordNumber;
           this.values = values != null ? values : EMPTY_STRING_ARRAY;
           this.mapping = mapping;
           this.comment = comment;
+        this.characterPosition = characterPosition;
       }

       /**
@@ -110,6 +113,16 @@ public final class CSVRecord implements
       }

       /**
+     * Returns the start position of this record as a character
position
in the source stream. This may or may not
+     * correspond to the byte position depending on the character set.
+     *
+     * @return the position of this record in the source stream.
+     */
+    public long getCharacterPosition() {
+        return characterPosition;
+    }
+
+    /**
        * Returns the comment for this record, if any.
        *
        * @return the comment for this record, or null if no comment for
this record is available.

Modified:

  commons/proper/csv/trunk/src/test/java/org/apache/commons/
csv/CSVParserTest.java

URL:

  http://svn.apache.org/viewvc/commons/proper/csv/trunk/src/
test/java/org/apache/commons/csv/CSVParserTest.java?rev=
1635052&r1=1635051&r2=1635052&view=diff



  ============================================================
==================

---

  commons/proper/csv/trunk/src/test/java/org/apache/commons/
csv/CSVParserTest.java

(original)
+++

  commons/proper/csv/trunk/src/test/java/org/apache/commons/
csv/CSVParserTest.java

Wed Oct 29 05:44:40 2014
@@ -299,22 +299,23 @@ public class CSVParserTest {
           }
       }

-//    @Test
-//    public void testStartWithEmptyLinesThenHeaders() throws
Exception

{

-//        final String[] codes = { "\r\n\r\n\r\nhello,\r\n\r\n\r\n",
"hello,\n\n\n", "hello,\"\"\r\n\r\n\r\n", "hello,\"\"\n\n\n" };
-//        final String[][] res = { { "hello", "" }, { "" }, // Excel
format does not ignore empty lines
-//                { "" } };
-//        for (final String code : codes) {
-//            final CSVParser parser = CSVParser.parse(code,
CSVFormat.EXCEL);
-//            final List<CSVRecord> records = parser.getRecords();
-//            assertEquals(res.length, records.size());
-//            assertTrue(records.size() > 0);
-//            for (int i = 0; i < res.length; i++) {
-//                assertArrayEquals(res[i], records.get(i).values());
-//            }
-//            parser.close();
-//        }
-//    }
+    // @Test
+    // public void testStartWithEmptyLinesThenHeaders() throws

Exception {

+    // final String[] codes = { "\r\n\r\n\r\nhello,\r\n\r\n\r\n",
"hello,\n\n\n", "hello,\"\"\r\n\r\n\r\n",
+    // "hello,\"\"\n\n\n" };
+    // final String[][] res = { { "hello", "" }, { "" }, // Excel
format
does not ignore empty lines
+    // { "" } };
+    // for (final String code : codes) {
+    // final CSVParser parser = CSVParser.parse(code, CSVFormat.EXCEL);
+    // final List<CSVRecord> records = parser.getRecords();
+    // assertEquals(res.length, records.size());
+    // assertTrue(records.size() > 0);
+    // for (int i = 0; i < res.length; i++) {
+    // assertArrayEquals(res[i], records.get(i).values());
+    // }
+    // parser.close();
+    // }
+    // }

       @Test
       public void testEndOfFileBehaviorCSV() throws Exception {
@@ -475,6 +476,16 @@ public class CSVParserTest {
       }

       @Test
+    public void testGetRecordPositionWithCRLF() throws Exception {
+        this.validateRecordPosition(CRLF);
+    }
+
+    @Test
+    public void testGetRecordPositionWithLF() throws Exception {
+        this.validateRecordPosition(String.valueOf(LF));
+    }
+
+    @Test
       public void testGetOneLine() throws IOException {
           final CSVParser parser = CSVParser.parse(CSV_INPUT_1,
CSVFormat.DEFAULT);
           final CSVRecord record = parser.getRecords().get(0);
@@ -902,4 +913,65 @@ public class CSVParserTest {
           parser.close();
       }

+    private void validateRecordPosition(final String lineSeparator)
throws IOException {
+        final String nl = lineSeparator; // used as linebreak in values
for better distinction
+
+        String code = "a,b,c" + lineSeparator + "1,2,3" + lineSeparator

+

+        // to see if recordPosition correctly points to the enclosing
quote
+                "'A" + nl + "A','B" + nl + "B',CC" + lineSeparator +
+                // unicode test... not very relevant while operating on
strings instead of bytes, but for
+                // completeness...
+                "\u00c4,\u00d6,\u00dc" + lineSeparator + "EOF,EOF,EOF";
+
+        final CSVFormat format =

  CSVFormat.newFormat(',').withQuote('\'').withRecordSeparator(
lineSeparator);

+        CSVParser parser = CSVParser.parse(code, format);
+
+        CSVRecord record;
+        assertEquals(0, parser.getRecordNumber());
+
+        assertNotNull(record = parser.nextRecord());
+        assertEquals(1, record.getRecordNumber());
+        assertEquals(code.indexOf('a'), record.getCharacterPosition())
;
+
+        assertNotNull(record = parser.nextRecord());
+        assertEquals(2, record.getRecordNumber());
+        assertEquals(code.indexOf('1'), record.getCharacterPosition())
;
+
+        assertNotNull(record = parser.nextRecord());
+        final long positionRecord3 = record.getCharacterPosition();
+        assertEquals(3, record.getRecordNumber());
+        assertEquals(code.indexOf("'A"),
record.getCharacterPosition());
+        assertEquals("A" + lineSeparator + "A", record.get(0));
+        assertEquals("B" + lineSeparator + "B", record.get(1));
+        assertEquals("CC", record.get(2));
+
+        assertNotNull(record = parser.nextRecord());
+        assertEquals(4, record.getRecordNumber());
+        assertEquals(code.indexOf('\u00c4'),
record.getCharacterPosition());
+
+        assertNotNull(record = parser.nextRecord());
+        assertEquals(5, record.getRecordNumber());
+        assertEquals(code.indexOf("EOF"),

record.getCharacterPosition());

+
+        parser.close();
+
+        // now try to read starting at record 3
+        parser = CSVParser.parse(code.substring((int)
positionRecord3),
format);
+        parser.setNextRecordNumber(3);
+        parser.setNextCharacterPosition(positionRecord3);
+
+        assertNotNull(record = parser.nextRecord());
+        assertEquals(3, record.getRecordNumber());
+        assertEquals(code.indexOf("'A"),
record.getCharacterPosition());
+        assertEquals("A" + lineSeparator + "A", record.get(0));
+        assertEquals("B" + lineSeparator + "B", record.get(1));
+        assertEquals("CC", record.get(2));
+
+        assertNotNull(record = parser.nextRecord());
+        assertEquals(4, record.getRecordNumber());
+        assertEquals(code.indexOf('\u00c4'),
record.getCharacterPosition());
+        assertEquals("\u00c4", record.get(0));
+
+        parser.close();
+    }
   }

Modified:

  commons/proper/csv/trunk/src/test/java/org/apache/commons/
csv/CSVRecordTest.java

URL:

  http://svn.apache.org/viewvc/commons/proper/csv/trunk/src/
test/java/org/apache/commons/csv/CSVRecordTest.java?rev=
1635052&r1=1635051&r2=1635052&view=diff



  ============================================================
==================

---

  commons/proper/csv/trunk/src/test/java/org/apache/commons/
csv/CSVRecordTest.java

(original)
+++

  commons/proper/csv/trunk/src/test/java/org/apache/commons/
csv/CSVRecordTest.java

Wed Oct 29 05:44:40 2014
@@ -45,12 +45,12 @@ public class CSVRecordTest {
       @Before
       public void setUp() throws Exception {
           values = new String[] { "A", "B", "C" };
-        record = new CSVRecord(values, null, null, 0);
+        record = new CSVRecord(values, null, null, 0, -1);
           header = new HashMap<String, Integer>();
           header.put("first", Integer.valueOf(0));
           header.put("second", Integer.valueOf(1));
           header.put("third", Integer.valueOf(2));
-        recordWithHeader = new CSVRecord(values, header, null, 0);
+        recordWithHeader = new CSVRecord(values, header, null, 0, -1);
       }

       @Test





--
http://people.apache.org/~britter/
http://www.systemoutprintln.de/
http://twitter.com/BenediktRitter
http://github.com/britter





---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org





---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org

Reply via email to