Re: [commons-io] question: file content merge sort and binary search

2023-07-20 Thread ssz
> I already commented elsewhere on why this is not a good fit for IO. My bad, I did not realise that the discussion was closed with a final resolution. I read "What do others think?" and thought it was an invitation to continue the discussion. > Instead of keeping on arguing to shove your library in IO I

Re: [commons-io] question: file content merge sort and binary search

2023-07-20 Thread Gilles Sadowski
On Thu, Jul 20, 2023 at 15:18, Gary Gregory wrote: > > [...] Instead of keeping on arguing to shove your library in [...] If we could stop the brutal language... (?) The OP asked politely, and was ready to wait indefinitely (unsubscribing from this ML) for an answer; I just wanted to make

Re: [commons-io] question: file content merge sort and binary search

2023-07-20 Thread Gary Gregory
I already commented elsewhere (can't recall if it was on this list or github) on why this is not a good fit for IO. Too much like a database operation, IO is a lower level library, and so on. IO is not a kitchen sink for anything related to IO. Like Lang, it was initially conceived as a library

Re: [commons-io] question: file content merge sort and binary search

2023-07-20 Thread ssz
> Which projects (ASF?) depend on your proposed contribution? Currently only business projects, not opensource. I'm thinking about RDF-Graph backed by FS. If I implement this solution I will raise an issue with the Apache Jena team. The original library will probably support multiplatform, in

Re: [commons-io] question: file content merge sort and binary search

2023-07-20 Thread ssz
For sure, you can sort in memory, no problem there. I think we need to find a well-known library with non-in-memory sorting and binary searching, preferably a relatively new one (Java 8+). On Thu, Jul 20, 2023 at 3:08 PM ssz wrote: > That's great! > - But ANT is quite an ancient system, and it is now
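For readers unfamiliar with the term, "non-in-memory sorting" usually means an external merge sort: sort bounded chunks in memory, spill each to disk, then merge the sorted chunks. The following is only a minimal sketch of that general technique using the JDK alone; it is not the code of textfile-utils or Ant, and the class name `ExternalSortSketch`, the method `sort`, and the `maxLinesInMemory` parameter are hypothetical names chosen for illustration. It assumes one record per line and natural `String` ordering.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

class ExternalSortSketch {
    // One open chunk file plus the line currently at its head.
    private static final class Run implements Comparable<Run> {
        final BufferedReader reader;
        String head;

        Run(BufferedReader reader) throws IOException {
            this.reader = reader;
            this.head = reader.readLine();
        }

        @Override
        public int compareTo(Run other) {
            return head.compareTo(other.head);
        }
    }

    // Hypothetical entry point: sorts the lines of source into target,
    // holding at most maxLinesInMemory lines in memory during phase 1.
    static void sort(Path source, Path target, int maxLinesInMemory) throws IOException {
        List<Path> chunks = new ArrayList<>();
        // Phase 1: read bounded batches, sort each in memory, spill to a temp file.
        try (BufferedReader in = Files.newBufferedReader(source, StandardCharsets.UTF_8)) {
            List<String> buffer = new ArrayList<>();
            String line;
            while ((line = in.readLine()) != null) {
                buffer.add(line);
                if (buffer.size() == maxLinesInMemory) {
                    chunks.add(writeSortedChunk(buffer));
                    buffer.clear();
                }
            }
            if (!buffer.isEmpty()) {
                chunks.add(writeSortedChunk(buffer));
            }
        }
        // Phase 2: k-way merge; the queue always yields the run with the smallest head line.
        PriorityQueue<Run> queue = new PriorityQueue<>();
        try (BufferedWriter out = Files.newBufferedWriter(target, StandardCharsets.UTF_8)) {
            for (Path chunk : chunks) {
                Run run = new Run(Files.newBufferedReader(chunk, StandardCharsets.UTF_8));
                if (run.head != null) queue.add(run); else run.reader.close();
            }
            while (!queue.isEmpty()) {
                Run run = queue.poll();
                out.write(run.head);
                out.newLine();
                run.head = run.reader.readLine();
                if (run.head != null) queue.add(run); else run.reader.close();
            }
        } finally {
            // Close any readers left open after a failure, then remove the temp files.
            for (Run run : queue) run.reader.close();
            for (Path chunk : chunks) Files.deleteIfExists(chunk);
        }
    }

    private static Path writeSortedChunk(List<String> lines) throws IOException {
        Collections.sort(lines);
        Path chunk = Files.createTempFile("sort-chunk-", ".txt");
        Files.write(chunk, lines, StandardCharsets.UTF_8);
        return chunk;
    }

    public static void main(String[] args) throws IOException {
        Path source = Files.createTempFile("unsorted-", ".txt");
        Path target = Files.createTempFile("sorted-", ".txt");
        Files.write(source, Arrays.asList("d,420", "b,42", "b,21", "a:21", "c:\"42\""), StandardCharsets.UTF_8);
        sort(source, target, 2); // tiny limit so that several chunks are created
        System.out.println(Files.readAllLines(target, StandardCharsets.UTF_8));
    }
}
```

Memory use in phase 1 is bounded by `maxLinesInMemory`; in phase 2 it is one buffered reader per chunk. A production implementation would also need to handle custom comparators, record delimiters other than newlines, and limits on the number of simultaneously open files.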

Re: [commons-io] question: file content merge sort and binary search

2023-07-20 Thread ssz
That's great! - But ANT is quite an ancient system, and it is now relatively unknown. - And it is relatively heavy. Maybe it's better to have this single function in a dedicated library, or in a well-known library with other useful features. - It uses in-memory sorting:

Re: [commons-io] question: file content merge sort and binary search

2023-07-20 Thread Gary Gregory
Note that Apache Ant already provides similar functionality: https://ant.apache.org/manual/api/org/apache/tools/ant/filters/SortFilter.html Gary On Thu, Jul 20, 2023, 07:38 Gilles Sadowski wrote: > Hi. > > [Disclaimer: I'm not a user nor a developer of "Commons IO", so > I'm not the most

Re: [commons-io] question: file content merge sort and binary search

2023-07-20 Thread Gilles Sadowski
Hi. [Disclaimer: I'm not a user nor a developer of "Commons IO", so I'm not the most suitable for entertaining this conversation and, surely, I shouldn't be the only one...] On Thu, Jul 20, 2023 at 10:33, ssz wrote: > > Hi > Sure, I will support my code. > I have a lot of other opensource

Re: [commons-io] question: file content merge sort and binary search

2023-07-20 Thread ssz
Hi Sure, I will support my code. I have a lot of other opensource projects, not so much free time. But this code will have the highest priority as Commons is used by thousands of developers. My other projects are used by hundreds of people. On Wed, Jul 19, 2023 at 8:28 PM Gilles Sadowski wrote:

Re: [commons-io] question: file content merge sort and binary search

2023-07-19 Thread Gilles Sadowski
Hi. On Tue, Jul 18, 2023 at 19:06, ssz wrote: > > [...] > > We use this library as a second-level cache when parsing CIMXML RDF, this > file-based cache contains triples, and also subject-type pairs (RDF nodes). > It is not csv. > Also, I'm thinking about RDF-Graph implementation backed by

Re: [commons-io] question: file content merge sort and binary search

2023-07-19 Thread ssz
I added some additional details to README.md. Please let me know if I can add something to make it clearer. On Tue, Jul 18, 2023 at 7:25 PM Gilles Sadowski wrote: > Hello. > > On Tue, Jul 18, 2023 at 17:35, ssz wrote: > > > > here >

Re: [commons-io] question: file content merge sort and binary search

2023-07-18 Thread ssz
I thought everything was described in sufficient detail in the documentation and the project's README ... Obviously it is not so clear; my fault, sorry. I should probably consider adding more explanation to README.md. But I thought that merge-sorting and binary-search are well-known algorithms, and we

Re: [commons-io] question: file content merge sort and binary search

2023-07-18 Thread Gilles Sadowski
Hello. On Tue, Jul 18, 2023 at 17:35, ssz wrote: > > here https://github.com/sszuev/textfile-utils-examples/tree/master/src/test Yes, this shows the API and its usage, but I was also wondering about actual uses. What kind of applications would need to call this functionality from Java?

Re: [commons-io] question: file content merge sort and binary search

2023-07-18 Thread ssz
here https://github.com/sszuev/textfile-utils-examples/tree/master/src/test On Tue, Jul 18, 2023 at 12:03 PM Gilles Sadowski wrote: > Hello. > > On Tue, Jul 18, 2023 at 10:50, ssz wrote: > > > > Hello there > > > > I see this issue on hold. > > So far, no one else has an opinion on this

Re: [commons-io] question: file content merge sort and binary search

2023-07-18 Thread Gilles Sadowski
Hello. On Tue, Jul 18, 2023 at 10:50, ssz wrote: > > Hello there > > I see this issue on hold. > So far, no one else has an opinion on this issue. Maybe "Commons Text"? It would help to see use-cases and API examples (in Java). Regards, Gilles > I'm going to unsubscribe from this list for

Re: [commons-io] question: file content merge sort and binary search

2023-07-18 Thread ssz
Hello there I see this issue on hold. So far, no one else has an opinion on this issue. I'm going to unsubscribe from this list for a while. Please email me directly in case of a positive final decision. sss.zuev {at} gmail / com Thanks! On Mon, Jul 10, 2023 at 12:17 AM Gary Gregory wrote: >

Re: [commons-io] question: file content merge sort and binary search

2023-07-09 Thread ssz
A few remarks: -- I think the (CSV/Java) Stream API is not directly suitable: it is difficult to implement sorting using streams alone. If you can suggest a simpler solution, I will really appreciate it, because the simpler the code, the better. In the library, of course, streams (channels) and

Re: [commons-io] question: file content merge sort and binary search

2023-07-09 Thread Gary Gregory
I've thought about this a little more and it seems to me that sorting and searching through any old binary file does not fit the remit of Commons IO or CSV. If anything it would be a new component, but it feels like the kind of database operations that do not fit in Commons. What do others think?

Re: [commons-io] question: file content merge sort and binary search

2023-07-09 Thread Gary Gregory
Commons CSV supports the Java Streaming API so you can do whatever that API offers, including filtering, sorting, finding, and so on. More than plain CSVs are supported, and I encourage you to peruse the site https://commons.apache.org/proper/commons-csv/ If you think that component can be

Re: [commons-io] question: file content merge sort and binary search

2023-07-09 Thread ssz
Does commons-csv support **sorting** large files? Does it support binary search? What should I do if I have a non-CSV text file? Actually, I didn't say that textfile-utils is a library for working with CSV files. I just provided you with an example. On Sun, Jul 9, 2023 at 8:23 PM Gary Gregory wrote:

Re: [commons-io] question: file content merge sort and binary search

2023-07-09 Thread Gary Gregory
If the intent is to process CSV files, you're missing quite a few parameters needed to process all of the different CSV flavors; see Apache Commons CSV. Gary On Sun, Jul 9, 2023, 13:16 ssz wrote: > text-files sort. e.g. CSV. > > Example: > content: `d,420;b,42;b,21;a:21;c:"42"`, delimiter ';' >

Re: [commons-io] question: file content merge sort and binary search

2023-07-09 Thread ssz
More examples (in code): (sort) https://github.com/DataFabricRus/textfile-utils/blob/main/src/test/kotlin/MergeSortTest.kt#L202 (search) https://github.com/DataFabricRus/textfile-utils/blob/main/src/test/kotlin/BinarySearchTest.kt#L20 On Sun, Jul 9, 2023 at 8:14 PM ssz wrote: > text-files sort.

Re: [commons-io] question: file content merge sort and binary search

2023-07-09 Thread ssz
Sorting of text files, e.g. CSV. Example: content: `d,420;b,42;b,21;a:21;c:"42"`, delimiter ';' after sort by prefix: `a:21;b,42;b,21;c:"42";d,420` binary search by prefix `b`: `b,42;b,21` The project comes complete with tests and documentation. It is open source. Github:
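The example above can be reproduced with a small in-memory Java sketch. This is not the textfile-utils API; `PrefixSortDemo`, `sortByPrefix`, and `searchByPrefix` are hypothetical names, and the sketch assumes that the "prefix" is the first character of each record and that the input records are `a:21` and `c:"42"`, as they appear in the sorted output. The sort is stable, so `b,42` stays ahead of `b,21`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.regex.Pattern;

class PrefixSortDemo {
    // Assumption for this sketch: the sort key is the first character of the record.
    static String prefix(String record) {
        return record.isEmpty() ? "" : record.substring(0, 1);
    }

    // Splits the content on a literal delimiter and sorts the records by key.
    // List.sort is stable, so records with equal keys keep their original order.
    static List<String> sortByPrefix(String content, String delimiter) {
        List<String> records = new ArrayList<>(Arrays.asList(content.split(Pattern.quote(delimiter))));
        records.sort(Comparator.comparing(PrefixSortDemo::prefix));
        return records;
    }

    // Lower-bound binary search for the first record with the given key,
    // followed by a forward scan that collects every record sharing that key.
    static List<String> searchByPrefix(List<String> sorted, String key) {
        int lo = 0;
        int hi = sorted.size();
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (prefix(sorted.get(mid)).compareTo(key) < 0) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        List<String> hits = new ArrayList<>();
        for (int i = lo; i < sorted.size() && prefix(sorted.get(i)).equals(key); i++) {
            hits.add(sorted.get(i));
        }
        return hits;
    }

    public static void main(String[] args) {
        List<String> sorted = sortByPrefix("d,420;b,42;b,21;a:21;c:\"42\"", ";");
        System.out.println(String.join(";", sorted));                      // a:21;b,42;b,21;c:"42";d,420
        System.out.println(String.join(";", searchByPrefix(sorted, "b"))); // b,42;b,21
    }
}
```

The library discussed in the thread targets files too large for memory; this sketch only illustrates the ordering and lookup semantics on a small string.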

Re: [commons-io] question: file content merge sort and binary search

2023-07-09 Thread Gary Gregory
Hello, This seems to me like a mismatch with Commons IO. What does it even mean to "sort" a file, which is really just a bunch of bytes? Do you have a relevant example (Java-based)? This feels more like a database primitive to me. What am I missing? Gary On Sun, Jul 9, 2023, 10:42 ssz wrote: >

[commons-io] question: file content merge sort and binary search

2023-07-09 Thread ssz
It seems to be well-known and generic functionality, so it would be nice to have it in some well-known common place. Is *apache/commons-io* this place? Here is the draft: https://github.com/DataFabricRus/textfile-utils This is my library made for DataFabric; it is written in Kotlin with