Re: [RDF] adding Quads - Bulk

2024-03-10 Thread Peter Hull
I don't know much about RDF4J but my comments would be:
1. Better IMO to add a default method to the interface rather than
adding 11 implementations which are almost identical
2. No need to restrict it to a List when any Iterable would do
3. There is a bulk 'add' method in RepositoryConnection
(https://rdf4j.org/javadoc/latest/org/eclipse/rdf4j/repository/RepositoryConnection.html#add(java.lang.Iterable,org.eclipse.rdf4j.model.Resource...))
- might be better to use it
4. Why was RDF4JServiceLoaderTest.java deleted?
5. Method name 'addAll' is more consistent with Java collections but
it seems RDF4J uses 'add' overloaded for single or multiple additions
- don't know if Commons has any preference on that.
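
Points 1 to 3 can be sketched in a few lines. This is only an illustration under stated assumptions: `StatementSink`, `PerCallStore`, `BatchingStore` and the `connectionsOpened` counter are hypothetical stand-ins, not Commons RDF or RDF4J types, and the wildcard in `Iterable<? extends T>` is this sketch's own choice. The idea is that one default method covers every implementation, and only a backend with a cheaper bulk path needs to override it.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an interface like GraphLike<T>.
interface StatementSink<T> {
    void add(T statement);

    // Default bulk method: every implementation inherits this loop for free.
    default void addAll(Iterable<? extends T> statements) {
        for (T statement : statements) {
            add(statement);
        }
    }
}

// Hypothetical store that pays a fixed cost (one "connection") per add call.
class PerCallStore<T> implements StatementSink<T> {
    final List<T> stored = new ArrayList<>();
    int connectionsOpened = 0;

    @Override
    public void add(T statement) {
        connectionsOpened++; // one connection for each single statement
        stored.add(statement);
    }
}

// Same store, but the bulk path is overridden to open a single connection.
class BatchingStore<T> extends PerCallStore<T> {
    @Override
    public void addAll(Iterable<? extends T> statements) {
        connectionsOpened++; // one connection for the whole batch
        for (T statement : statements) {
            stored.add(statement);
        }
    }
}
```

With three statements, the inherited default costs three connections on `PerCallStore` and the override costs one on `BatchingStore`, which is the kind of saving a real bulk method in the backend might give.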

Hope that's helpful,
Pete

On Sun, 10 Mar 2024 at 13:03, Fred Hauschel wrote:
>
> Ah, sorry. i forgot to post this PR here:
> https://github.com/apache/commons-rdf/pull/205
> I did it right after the mail.
>
> Thanks Fredy
>
> On 10.03.24 13:42, Peter Hull wrote:
> > On Sat, 9 Mar 2024 at 22:37, Fred Hauschel wrote:
> >
> >> Is there a reason, why there is no Method like GraphLike#add(List
> >> statements); ?
> > It would be possible to add a method with a default implementation to
> > GraphLike - then look to see if this can be more efficient for
> > RDF4J (e.g. avoiding the multiple connections)
> > I don't think this would break compatibility unless someone had
> > already implemented addAll on a subclass, with a different signature.
> > Peter
> >
> > diff --git a/commons-rdf-api/src/main/java/org/apache/commons/rdf/api/GraphLike.java b/commons-rdf-api/src/main/java/org/apache/commons/rdf/api/GraphLike.java
> > index f50423f8..0b7b936a 100644
> > --- a/commons-rdf-api/src/main/java/org/apache/commons/rdf/api/GraphLike.java
> > +++ b/commons-rdf-api/src/main/java/org/apache/commons/rdf/api/GraphLike.java
> > @@ -55,6 +55,17 @@ public interface GraphLike<T extends TripleLike> {
> >       */
> >      void add(T statement);
> >
> > +    /**
> > +     * Add a collection of statements.
> > +     *
> > +     * @param statements the TripleLike statements to add
> > +     */
> > +    default void addAll(Iterable<T> statements) {
> > +        for (T statement : statements) {
> > +            add(statement);
> > +        }
> > +    }
> > +
> >      /**
> >       * Remove all statements.
> >       */
> >
> > -
> > To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> > For additional commands, e-mail: dev-h...@commons.apache.org
> >
>
>




Re: [RDF] adding Quads - Bulk

2024-03-10 Thread Peter Hull
On Sat, 9 Mar 2024 at 22:37, Fred Hauschel wrote:

> Is there a reason, why there is no Method like GraphLike#add(List
> statements); ?

It would be possible to add a method with a default implementation to
GraphLike - then look to see if this can be more efficient for
RDF4J (e.g. avoiding the multiple connections)
I don't think this would break compatibility unless someone had
already implemented addAll on a subclass, with a different signature.
Peter

diff --git a/commons-rdf-api/src/main/java/org/apache/commons/rdf/api/GraphLike.java b/commons-rdf-api/src/main/java/org/apache/commons/rdf/api/GraphLike.java
index f50423f8..0b7b936a 100644
--- a/commons-rdf-api/src/main/java/org/apache/commons/rdf/api/GraphLike.java
+++ b/commons-rdf-api/src/main/java/org/apache/commons/rdf/api/GraphLike.java
@@ -55,6 +55,17 @@ public interface GraphLike<T extends TripleLike> {
      */
     void add(T statement);
 
+    /**
+     * Add a collection of statements.
+     *
+     * @param statements the TripleLike statements to add
+     */
+    default void addAll(Iterable<T> statements) {
+        for (T statement : statements) {
+            add(statement);
+        }
+    }
+
     /**
      * Remove all statements.
      */
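
The compatibility point can be checked with a small self-contained sketch. All names here (`Sink`, `LegacySink`, `VarargsSink`) are hypothetical and only mimic the shape of the patch. It suggests that an implementation written before the default method existed keeps compiling unchanged, and that a pre-existing `addAll` with different parameter types is simply an overload. As far as I can tell, a real clash needs a method whose parameters match the new default (or erase to the same type) while something else, such as the return type, is incompatible.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the patched interface.
interface Sink<T> {
    void add(T statement);

    default void addAll(Iterable<T> statements) {
        for (T statement : statements) {
            add(statement);
        }
    }
}

// "Old" implementation written before addAll existed: no changes needed,
// it inherits the default method.
class LegacySink implements Sink<String> {
    final List<String> seen = new ArrayList<>();

    @Override
    public void add(String statement) {
        seen.add(statement);
    }
}

// Implementation that already had its own addAll with different parameters.
// This compiles: it is an overload of the default, not an override.
// By contrast, declaring `boolean addAll(Iterable<String> statements)` here
// would be rejected, because the return type conflicts with the default.
class VarargsSink extends LegacySink {
    int varargsCalls = 0;

    public void addAll(String... statements) {
        varargsCalls++;
        for (String statement : statements) {
            add(statement);
        }
    }
}
```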




Re: [COMPRESS] Help needed to fix COMPRESS-651

2024-03-09 Thread Peter Hull
On Sat, 9 Mar 2024 at 18:19, Peter Hull wrote:
> There's another constructor for BZip2CompressorInputStream which
> allows for this, it's not the default.
Specifically, the patch below makes the test pass. Whether this should
be default for the one-arg constructor is a matter for discussion.
Peter
diff --git a/src/test/java/org/apache/commons/compress/compressors/bzip2/BZip2Compress651Test.java b/src/test/java/org/apache/commons/compress/compressors/bzip2/BZip2Compress651Test.java
index d5c7e17c9..079a90c37 100644
--- a/src/test/java/org/apache/commons/compress/compressors/bzip2/BZip2Compress651Test.java
+++ b/src/test/java/org/apache/commons/compress/compressors/bzip2/BZip2Compress651Test.java
@@ -38,13 +38,12 @@
 public class BZip2Compress651Test {
 
     @Test
-    @Disabled
     public void testCompress651() throws IOException {
         final int buffersize = 102_400;
         final Path pathIn = Paths.get("src/test/resources/org/apache/commons/compress/COMPRESS-651/my10m.tar.bz2");
         final Path pathOut = Paths.get("target/COMPRESS-651/test.tar");
         Files.createDirectories(pathOut.getParent());
-        try (BZip2CompressorInputStream inputStream = new BZip2CompressorInputStream(new BufferedInputStream(Files.newInputStream(pathIn)));
+        try (BZip2CompressorInputStream inputStream = new BZip2CompressorInputStream(new BufferedInputStream(Files.newInputStream(pathIn)), true);
                 OutputStream outputStream = Files.newOutputStream(pathOut)) {
             IOUtils.copy(inputStream, outputStream, buffersize);
         }




Re: [COMPRESS] Help needed to fix COMPRESS-651

2024-03-09 Thread Peter Hull
On Sat, 9 Mar 2024 at 14:33, Gary Gregory wrote:
>
> If you want to help in Commons COMPRESS-land, please see
> https://issues.apache.org/jira/browse/COMPRESS-651
I swear I looked into this a while ago, and the issue was pbzip2 works
by compressing the source in chunks, in parallel, then cat'ing those
compressed chunks together.
There's another constructor for BZip2CompressorInputStream which
allows for this, it's not the default.
I can't find any record of it though, maybe I'm losing my mind.
Peter
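
To make the "chunks cat'ed together" behaviour concrete, here is a rough stand-in sketch. The JDK has no bzip2 codec, so it uses zlib streams from java.util.zip instead; the class name and the `concatenated` flag are hypothetical, loosely modelled on the boolean constructor argument mentioned above. It only illustrates the general effect: a reader that stops at the end of the first compressed stream returns just the first chunk, while one that restarts on the leftover bytes returns everything.

```java
import java.io.ByteArrayOutputStream;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Stand-in demo with zlib streams; not bzip2 and not Commons Compress code.
class ConcatenatedStreamsDemo {

    // Compress one chunk into a complete, self-terminated zlib stream.
    static byte[] compress(byte[] chunk) {
        Deflater deflater = new Deflater();
        deflater.setInput(chunk);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[1024];
        while (!deflater.finished()) {
            out.write(buf, 0, deflater.deflate(buf));
        }
        deflater.end();
        return out.toByteArray();
    }

    // What a parallel compressor effectively does after compressing chunks.
    static byte[] concat(byte[] a, byte[] b) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(a, 0, a.length);
        out.write(b, 0, b.length);
        return out.toByteArray();
    }

    // Decode the first stream only, or every stream when 'concatenated' is true.
    static byte[] decompress(byte[] data, boolean concatenated) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[1024];
        int offset = 0;
        try {
            do {
                Inflater inflater = new Inflater();
                inflater.setInput(data, offset, data.length - offset);
                while (!inflater.finished()) {
                    int n = inflater.inflate(buf);
                    if (n == 0 && !inflater.finished()
                            && (inflater.needsInput() || inflater.needsDictionary())) {
                        throw new IllegalStateException("truncated or unsupported stream");
                    }
                    out.write(buf, 0, n);
                }
                // Bytes the inflater did not consume belong to the next stream.
                offset = data.length - inflater.getRemaining();
                inflater.end();
            } while (concatenated && offset < data.length);
        } catch (DataFormatException e) {
            throw new IllegalStateException(e);
        }
        return out.toByteArray();
    }
}
```

Calling `decompress(joined, false)` on two concatenated chunks yields only the first chunk's bytes, which mirrors the truncation reported in the issue; passing `true` yields both.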




Re: [COMPRESS] Decompress BZIP2 File Max Output is 900000 chars

2024-01-31 Thread Peter Hull
I can't add to the JIRA bug but I had a quick play on WSL (Debian),
Java 21, compress 1.25.0 and found:
Using dd if=/dev/random I could create a big file, compress it with
bzip2 and then decompress it with BZip2CompressorInputStream, no
problems.
The same file compressed with pbzip2 was truncated at 900000 bytes, as described.
Those 900000 bytes were just the first 900000 bytes of the correct output.
So it is pbzip2 vs bzip2, nothing to do with tar files.

Description for BZip2CompressorInputStream
(https://commons.apache.org/proper/commons-compress/apidocs/org/apache/commons/compress/compressors/bzip2/BZip2CompressorInputStream.html)
says there is another constructor with a boolean flag for
decompressing concatenated files.

Using this constructor appears to work OK.

Therefore I assume that pbzip2 creates concatenated bzip files?

Hope that helps
Peter
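
One rough way to test that assumption on a given file is to count stream headers. The sketch below is a heuristic, not part of Commons Compress, and the class and method names are made up for illustration. It relies on the bzip2 layout as I understand it: each stream starts with "BZh" plus a level digit 1 to 9, a non-empty stream continues with the block magic 0x314159265359, and each stream is padded to a whole byte so a following stream starts byte-aligned.

```java
// Heuristic header counter; illustrative names only.
class BZip2StreamCounter {

    // Block-header magic that follows "BZh" + level digit in a non-empty stream.
    private static final int[] BLOCK_MAGIC = {0x31, 0x41, 0x59, 0x26, 0x53, 0x59};

    // Count byte-aligned occurrences of "BZh[1-9]" followed by the block magic.
    static int countStreamHeaders(byte[] data) {
        int count = 0;
        for (int i = 0; i + 10 <= data.length; i++) {
            if (data[i] == 'B' && data[i + 1] == 'Z' && data[i + 2] == 'h'
                    && data[i + 3] >= '1' && data[i + 3] <= '9'
                    && hasBlockMagic(data, i + 4)) {
                count++;
            }
        }
        return count;
    }

    private static boolean hasBlockMagic(byte[] data, int from) {
        for (int k = 0; k < BLOCK_MAGIC.length; k++) {
            if ((data[from + k] & 0xFF) != BLOCK_MAGIC[k]) {
                return false;
            }
        }
        return true;
    }
}
```

Feeding it the bytes of a file (for example via `Files.readAllBytes`) and getting a count above 1 would suggest concatenated streams. A chance match inside compressed data is possible in principle, so treat the result as a hint only.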

On Wed, 31 Jan 2024 at 12:57, Gary D. Gregory wrote:
>
> Hi All,
>
> If anyone is looking for an issue to investigate:
>
> [COMPRESS-651] Decompress BZIP2 File Max Output is 900000 chars
> https://issues.apache.org/jira/browse/COMPRESS-651
>
> Gary
>
>




Re: [CRYPTO] Why does Makefile use CXX (C++) for linking?

2023-11-02 Thread Peter Hull
On Thu, 2 Nov 2023 at 10:35, sebb wrote:
> On macOS, CC and CXX have the same definition, so it's not surprising
> there was no difference in your testing.
Face palm. Sorry




Re: [CRYPTO] Why does Makefile use CXX (C++) for linking?

2023-11-02 Thread Peter Hull
On Wed, 1 Nov 2023 at 23:54, Alex Remily wrote:
> I believe it is for cross compilation, owing to the comments in the
> makefile:
> # for cross-compilation on Ubuntu, install the g++-mingw-w64-x86-64 package
I think you could also argue that the other way around - ie. those are
the packages you need if you want to cross-compile C++.

I quickly tried replacing CXX with CC on a Mac running macOS 10.15.
Admittedly both builds failed, but in the same way: finding "LibreSSL" in the
version string rather than "OpenSSL". This is mentioned in
BUILDING.txt and I assume it could be fixed by following the
suggestion in that file, but I can't risk messing anything up just at
the moment. So I don't think in that case using the C linker instead
of the C++ linker caused any problems.

This seems like a bit of a "Chesterton's Fence" to me!

Peter




Re: [IMAGING] Logging vs Throwing exceptions

2023-05-31 Thread Peter Hull
That whole class looks like it needs a bit of TLC (or Javadoc at least!)

On Wed, 31 May 2023 at 06:49, Miguel Muñoz wrote:
>
>
> In addition to logging and swallowing the exception, this method also then 
> returns null. This is also a bad practice.
>
>
>
> The caller has to check for null. One of the reasons exceptions were invented 
> was to free the user from needing to check for null or error codes.
>
> — Miguel
>
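
The contrast Miguel describes can be sketched as follows. This is a generic illustration, not the Imaging class under discussion: `parseWidth`, `widthOrNull` and `width` are hypothetical names, and `IOException` stands in for whatever exception the real method catches.

```java
import java.io.IOException;
import java.util.logging.Logger;

// Generic illustration of "log and return null" versus "let it throw".
class ExceptionVsNullDemo {
    private static final Logger LOG = Logger.getLogger("demo");

    // Hypothetical low-level read that fails on empty input.
    static int parseWidth(String header) throws IOException {
        if (header == null || header.isEmpty()) {
            throw new IOException("empty header");
        }
        return header.length();
    }

    // Criticised pattern: log, swallow, return null.
    // Every caller now has to remember a null check.
    static Integer widthOrNull(String header) {
        try {
            return parseWidth(header);
        } catch (IOException e) {
            LOG.warning("could not read width: " + e.getMessage());
            return null;
        }
    }

    // Alternative: propagate the exception, so the failure cannot be ignored
    // silently and the success path returns a plain int.
    static int width(String header) throws IOException {
        return parseWidth(header);
    }
}
```

With `widthOrNull`, a forgotten null check turns the original error into a NullPointerException somewhere else; with `width`, the compiler makes the caller decide what to do about the failure.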




Re: [CSV] New feature to allow access to leading/trailing comments in CSV files?

2022-09-06 Thread Peter Hull
On Tue, 6 Sept 2022 at 15:56, Gilles Sadowski wrote:

>
> About your patch: It is preferable to have a separate test method for
> each test case.  If there is no better description, it is fine to append
> a "counter" to the "common" test name. i.e.
>
Hi Gilles,
I have done this, partly, and there are 14 test methods. I still have two
tests in each method, one for hasXXX() and one for getXXX(). It seems a bit
excessive already. In your judgement, should I cut some of them out?
https://github.com/apache/commons-csv/pull/257/commits/0414d1e4b79a4f42d24c8b9a7547a8cbf4a40cf0
Peter


Re: [CSV] New feature to allow access to leading/trailing comments in CSV files?

2022-09-06 Thread Peter Hull
Hi Gary,
Thanks for that, I've done it now. I didn't really mean to ask "how" to
submit a pull request, more "where" to submit it, as the Apache page just
mentions a repo at gitbox.apache.org and the Contributing page describes
attaching a patch file derived from SVN. I assumed the github repo was just
mirrored for convenience.
Peter

On Tue, 6 Sept 2022 at 15:23, Gary Gregory wrote:

> Please see
>
> https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/proposing-changes-to-your-work-with-pull-requests/creating-a-pull-request
>
> Gary
>
> On Tue, Sep 6, 2022, 06:05 Peter Hull wrote:
>
> > Hi Bruno,
> > Thanks for the swift reply! I have created CSV-304. I attached a patch to
> > the ticket but I don't know how to submit a pull request, please could you
> > advise?
> > Peter
> >
> > On Tue, 6 Sept 2022 at 11:37, Bruno Kinoshita wrote:
> >
> > > Hi Peter,
> > >
> > > I think not keeping comments may help with memory management in cases where
> > > you have an enormous amount of comments, or maybe speed up processing if
> > > you discard them? Not sure.
> > >
> > > But in any case, if you already have the patch working, I'd suggest 1)
> > > taking a look at the JIRA of CSV and searching for any open or closed
> > > issues similar to this one (I feel like I heard something similar before
> > > for Commons CSV), and then 2) creating an issue to the CSV component and 3)
> > > prepare the pull request using a commit message like "[CSV-1234etc]
> > > Description...", and the PR title "[CSV-1234] Title..." . This way others
> > > can review your code and comment there. And having the JIRA will help
> > > future users with similar use cases in case it's not maintained, or if
> > > there's some other feature they are missing.
> > >
> > > Thanks
> > > -Bruno
> > >
> > > On Tue, 6 Sept 2022 at 20:31, Peter Hull wrote:
> > >
> > > > Dear all,
> > > > I have an application where it would be useful to be able to get the
> > > > leading comments (ie. before the first record) from a CSV file.
> > > > I asked a question on StackOverflow[1] but I got no replies and as far as I
> > > > can see it's not possible.
> > > > I looked into implementing this myself and it appeared to be pretty
> > > > straightforward, since the CSV parser already pulls out the comments but
> > > > then discards them. It was also straightforward to access trailing comments
> > > > too. I created a patch with the implementation and a test.
> > > > Would there be any interest from the commons-csv developers in this patch?
> > > > I appreciate there may be reasons I am not aware of as to why commons-csv
> > > > doesn't do this already.
> > > > Thanks,
> > > > Peter
> > > >
> > > > [1]:
> > > > https://stackoverflow.com/questions/72619095/get-leading-comments-from-csv-with-apache-commons-csv


Re: [CSV] New feature to allow access to leading/trailing comments in CSV files?

2022-09-06 Thread Peter Hull
Hi Bruno,
Thanks for the swift reply! I have created CSV-304. I attached a patch to
the ticket but I don't know how to submit a pull request, please could you
advise?
Peter

On Tue, 6 Sept 2022 at 11:37, Bruno Kinoshita wrote:

> Hi Peter,
>
> I think not keeping comments may help with memory management in cases where
> you have an enormous amount of comments, or maybe speed up processing if
> you discard them? Not sure.
>
> But in any case, if you already have the patch working, I'd suggest 1)
> taking a look at the JIRA of CSV and searching for any open or closed
> issues similar to this one (I feel like I heard something similar before
> for Commons CSV), and then 2) creating an issue to the CSV component and 3)
> prepare the pull request using a commit message like "[CSV-1234etc]
> Description...", and the PR title "[CSV-1234] Title..." . This way others
> can review your code and comment there. And having the JIRA will help
> future users with similar use cases in case it's not maintained, or if
> there's some other feature they are missing.
>
> Thanks
> -Bruno
>
> On Tue, 6 Sept 2022 at 20:31, Peter Hull wrote:
>
> > Dear all,
> > I have an application where it would be useful to be able to get the
> > leading comments (ie. before the first record) from a CSV file.
> > I asked a question on StackOverflow[1] but I got no replies and as far as I
> > can see it's not possible.
> > I looked into implementing this myself and it appeared to be pretty
> > straightforward, since the CSV parser already pulls out the comments but
> > then discards them. It was also straightforward to access trailing comments
> > too. I created a patch with the implementation and a test.
> > Would there be any interest from the commons-csv developers in this patch?
> > I appreciate there may be reasons I am not aware of as to why commons-csv
> > doesn't do this already.
> > Thanks,
> > Peter
> >
> > [1]:
> > https://stackoverflow.com/questions/72619095/get-leading-comments-from-csv-with-apache-commons-csv


[CSV] New feature to allow access to leading/trailing comments in CSV files?

2022-09-06 Thread Peter Hull
Dear all,
I have an application where it would be useful to be able to get the
leading comments (ie. before the first record) from a CSV file.
I asked a question on StackOverflow[1] but I got no replies and as far as I
can see it's not possible.
I looked into implementing this myself and it appeared to be pretty
straightforward, since the CSV parser already pulls out the comments but
then discards them. It was also straightforward to access trailing comments
too. I created a patch with the implementation and a test.
Would there be any interest from the commons-csv developers in this patch?
I appreciate there may be reasons I am not aware of as to why commons-csv
doesn't do this already.
Thanks,
Peter

[1]:
https://stackoverflow.com/questions/72619095/get-leading-comments-from-csv-with-apache-commons-csv
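
For readers who want to picture the idea, here is a toy sketch of keeping leading and trailing comments instead of discarding them. It is not the Commons CSV parser, the patch itself, or its API: `CommentAwareCsv` and its fields are hypothetical, and the reader is deliberately naive (one record per line, no quoting).

```java
import java.util.ArrayList;
import java.util.List;

// Toy line-based reader; illustrative only.
class CommentAwareCsv {
    final List<String> leadingComments = new ArrayList<>();
    final List<String> trailingComments = new ArrayList<>();
    final List<String[]> records = new ArrayList<>();

    static CommentAwareCsv parse(String text, char commentMarker) {
        CommentAwareCsv result = new CommentAwareCsv();
        // Comments seen since the last record; their fate is decided later.
        List<String> pending = new ArrayList<>();
        for (String line : text.split("\\R")) {
            if (line.isEmpty()) {
                continue;
            }
            if (line.charAt(0) == commentMarker) {
                pending.add(line.substring(1).trim());
            } else {
                if (result.records.isEmpty()) {
                    // Comments before the first record are the leading ones.
                    result.leadingComments.addAll(pending);
                }
                // Comments between records are dropped in this sketch.
                pending.clear();
                result.records.add(line.split(",", -1));
            }
        }
        // Whatever is still pending came after the last record. If there were
        // no records at all, this sketch reports the comments as leading.
        if (result.records.isEmpty()) {
            result.leadingComments.addAll(pending);
        } else {
            result.trailingComments.addAll(pending);
        }
        return result;
    }
}
```

The only extra state is the `pending` list, which is consistent with the observation that a parser already recognising comment lines needs little more than somewhere to keep them.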