Hey GP,

yes, this is strange.

I'll ask BBEdit support about it; perhaps they can point out some difference that I'm missing.
I'll come back to this.

Thanks again!


Regards,
Vlad


---

On 8 Apr 2025, at 4:25, GP wrote:

Hmm... This is really puzzling. Sub-pattern line sorting works for me but
for some unknown reason not for you.

Using your new sample data set:

MANDT;BU;IDENTIFIER;OBJNR;ADRC_ADDRNUMBER;ADRC_COUNTRY;ADRC_REGION;ADRC_POST_CODE1;ADRC_CITY1;ADRC_CITY_EXT;ADRC_CITY2;ADRC_STREET;ADRC_HOUSE_NUM1;ADRC_HOUSE_NUM2;LOKAREF_COUNTRY;LOKAREF_REGION;LOKAREF_POST_CODE1;LOKAREF_CITY1;LOKAREF_CITY_CODE;LOKAREF_CITY_EXT;LOKAREF_CITY2;LOKAREF_CITYP_CODE;LOKAREF_STREET;LOKAREF_STRT_CODE;LOKAREF_HOUSE_NUM1;LOKAREF_HOUSE_NUM2;COUNTRY_KZ;REGION_KZ;POST_CODE1_KZ;CITY1_KZ;CITY_EXT_KZ;CITY2_KZ;STREET_KZ;ADR_CHK_KZ;MSGNO;MESSAGE
200;BAG;20250324080508_/ETN/PM_EAV_ADR_CHK_ADRC_V14157F;;0007723592;DE;09;86415;Mering;;Sankt
Afra;Egerländer Straße;;;DE;09;86415;Mering;500000002795;, Schwab;Sankt
Afra;00000006;Egerländerstraße;910011919800;;;0;0;0;0;1;0;1;1;;
200;BAG;20250324080508_/ETN/PM_EAV_ADR_CHK_ADRC_V14157F;;0007723918;DE;09;93336;Altmannstein;;Berghausen;Altmannsteiner
Str.;;;DE;09;93336;Altmannstein;500000005266;;Berghausen;00000003;Altmannsteiner
Straße;910001339100;;;0;0;0;0;3;0;1;1;;
200;BAG;20250324080508_/ETN/PM_EAV_ADR_CHK_ADRC_V14157F;;0007723657;DE;09;85655;Aying;;Kaps;Kaps;;;DE;09;85653;Aying;500000002262;;Kaps;00000010;Kaps;700055566100;;;0;0;1;0;3;0;0;1;;
200;BAG;20250324080508_/ETN/PM_EAV_ADR_CHK_ADRC_V14157F;;0007723878;DE;09;93336;Altmannstein;;Berghausen;Altmannsteiner
Str.;;;DE;09;93336;Altmannstein;500000005266;;Berghausen;00000003;Altmannsteiner
Straße;910001339100;;;0;0;0;0;3;0;1;1;;
200;BAG;20250324080508_/ETN/PM_EAV_ADR_CHK_ADRC_V14157F;;0007723658;DE;09;83083;Riedering;;Patting;Patting;;;DE;09;83083;Riedering;500000002552;b
Rosenheim, Oberbay;Patting;00000037;Pattinger
Straße;910003809300;;;0;0;0;0;1;0;1;1;;
200;BAG;20250324080508_/ETN/PM_EAV_ADR_CHK_ADRC_V14157F;;0007723674;DE;09;85655;Aying;;Großhelfendorf;Hirschbergstraße;;;DE;09;85653;Aying;500000002262;;Großhelfendorf;00000007;Hirschbergstraße;910002873200;;;0;0;1;0;3;0;0;1;;
200;BAG;20250324080508_/ETN/PM_EAV_ADR_CHK_ADRC_V14157F;;0007723908;DE;09;93336;Altmannstein;;Berghausen;Altmannsteiner
Str.;;;DE;09;93336;Altmannstein;500000005266;;Berghausen;00000003;Altmannsteiner
Straße;910001339100;;;0;0;0;0;3;0;1;1;;
200;BAG;20250324080508_/ETN/PM_EAV_ADR_CHK_ADRC_V14157F;;0007724554;DE;09;95131;Schwarzenbach
a.Wald;;Schwarzenbach a
Wald;Walter-Münch-Straße;;;DE;09;95131;Schwarzenbach
a.Wald;500000011836;;Schwarzenbach
a.Wald;00000001;Walter-Münch-Straße;910007835500;;;0;0;0;0;3;1;0;1;;
200;BAG;20250324080508_/ETN/PM_EAV_ADR_CHK_ADRC_V14157F;;0007723956;DE;09;93336;Altmannstein;;Berghausen;Altmannsteiner
Str.;;;DE;09;93336;Altmannstein;500000005266;;Berghausen;00000003;Altmannsteiner
Straße;910001339100;;;0;0;0;0;3;0;1;1;;
200;BAG;20250324080508_/ETN/PM_EAV_ADR_CHK_ADRC_V14157F;;0007724593;DE;09;95131;Schwarzenbach
a.Wald;;Schwarzenbach a
Wald;Walter-Münch-Straße;;;DE;09;95131;Schwarzenbach
a.Wald;500000011836;;Schwarzenbach
a.Wald;00000001;Walter-Münch-Straße;910007835500;;;0;0;0;0;3;1;0;1;;

and for the Sort Lines ... "Sort using pattern"'s "Searching pattern:" of:

\d{3};\w{3};[^;]*;[^;]*;\d{10};(\w{2});(\d{2});(\d{5});([^;]*);[^;]*;([^;]*);([^;]*);([^;]*);[^;]*;\w{2};\d{2};\d{5};[^;]*;\d{12};[^;]*;[^;]*;\d{8};[^;]*;\d{12};[^;]*;[^;]*;\d;\d;\d;\d;\d;\d;\d;\d;([^;]*);[^\n]*

With "Specific sub-patterns:" of:

\8\1\2\3\4\5\6\7

and "Sorted lines to new document" result in a new document containing:

MANDT;BU;IDENTIFIER;OBJNR;ADRC_ADDRNUMBER;ADRC_COUNTRY;ADRC_REGION;ADRC_POST_CODE1;ADRC_CITY1;ADRC_CITY_EXT;ADRC_CITY2;ADRC_STREET;ADRC_HOUSE_NUM1;ADRC_HOUSE_NUM2;LOKAREF_COUNTRY;LOKAREF_REGION;LOKAREF_POST_CODE1;LOKAREF_CITY1;LOKAREF_CITY_CODE;LOKAREF_CITY_EXT;LOKAREF_CITY2;LOKAREF_CITYP_CODE;LOKAREF_STREET;LOKAREF_STRT_CODE;LOKAREF_HOUSE_NUM1;LOKAREF_HOUSE_NUM2;COUNTRY_KZ;REGION_KZ;POST_CODE1_KZ;CITY1_KZ;CITY_EXT_KZ;CITY2_KZ;STREET_KZ;ADR_CHK_KZ;MSGNO;MESSAGE
200;BAG;20250324080508_/ETN/PM_EAV_ADR_CHK_ADRC_V14157F;;0007723658;DE;09;83083;Riedering;;Patting;Patting;;;DE;09;83083;Riedering;500000002552;b
Rosenheim, Oberbay;Patting;00000037;Pattinger
Straße;910003809300;;;0;0;0;0;1;0;1;1;;
200;BAG;20250324080508_/ETN/PM_EAV_ADR_CHK_ADRC_V14157F;;0007723674;DE;09;85655;Aying;;Großhelfendorf;Hirschbergstraße;;;DE;09;85653;Aying;500000002262;;Großhelfendorf;00000007;Hirschbergstraße;910002873200;;;0;0;1;0;3;0;0;1;;
200;BAG;20250324080508_/ETN/PM_EAV_ADR_CHK_ADRC_V14157F;;0007723657;DE;09;85655;Aying;;Kaps;Kaps;;;DE;09;85653;Aying;500000002262;;Kaps;00000010;Kaps;700055566100;;;0;0;1;0;3;0;0;1;;
200;BAG;20250324080508_/ETN/PM_EAV_ADR_CHK_ADRC_V14157F;;0007723592;DE;09;86415;Mering;;Sankt
Afra;Egerländer Straße;;;DE;09;86415;Mering;500000002795;, Schwab;Sankt
Afra;00000006;Egerländerstraße;910011919800;;;0;0;0;0;1;0;1;1;;
200;BAG;20250324080508_/ETN/PM_EAV_ADR_CHK_ADRC_V14157F;;0007723918;DE;09;93336;Altmannstein;;Berghausen;Altmannsteiner
Str.;;;DE;09;93336;Altmannstein;500000005266;;Berghausen;00000003;Altmannsteiner
Straße;910001339100;;;0;0;0;0;3;0;1;1;;
200;BAG;20250324080508_/ETN/PM_EAV_ADR_CHK_ADRC_V14157F;;0007723878;DE;09;93336;Altmannstein;;Berghausen;Altmannsteiner
Str.;;;DE;09;93336;Altmannstein;500000005266;;Berghausen;00000003;Altmannsteiner
Straße;910001339100;;;0;0;0;0;3;0;1;1;;
200;BAG;20250324080508_/ETN/PM_EAV_ADR_CHK_ADRC_V14157F;;0007723908;DE;09;93336;Altmannstein;;Berghausen;Altmannsteiner
Str.;;;DE;09;93336;Altmannstein;500000005266;;Berghausen;00000003;Altmannsteiner
Straße;910001339100;;;0;0;0;0;3;0;1;1;;
200;BAG;20250324080508_/ETN/PM_EAV_ADR_CHK_ADRC_V14157F;;0007723956;DE;09;93336;Altmannstein;;Berghausen;Altmannsteiner
Str.;;;DE;09;93336;Altmannstein;500000005266;;Berghausen;00000003;Altmannsteiner
Straße;910001339100;;;0;0;0;0;3;0;1;1;;
200;BAG;20250324080508_/ETN/PM_EAV_ADR_CHK_ADRC_V14157F;;0007724554;DE;09;95131;Schwarzenbach
a.Wald;;Schwarzenbach a
Wald;Walter-Münch-Straße;;;DE;09;95131;Schwarzenbach
a.Wald;500000011836;;Schwarzenbach
a.Wald;00000001;Walter-Münch-Straße;910007835500;;;0;0;0;0;3;1;0;1;;
200;BAG;20250324080508_/ETN/PM_EAV_ADR_CHK_ADRC_V14157F;;0007724593;DE;09;95131;Schwarzenbach
a.Wald;;Schwarzenbach a
Wald;Walter-Münch-Straße;;;DE;09;95131;Schwarzenbach
a.Wald;500000011836;;Schwarzenbach
a.Wald;00000001;Walter-Münch-Straße;910007835500;;;0;0;0;0;3;1;0;1;;

So, given the sample data and the specific sub-patterns (record fields) we're sorting on, it ends up being the ADRC_POST_CODE1 field value which
determines how the lines are sorted when I do it.
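For anyone who wants to cross-check the result outside BBEdit, here is a rough Python sketch of the same sub-pattern sort. It is not BBEdit's implementation. It assumes each record sits on a single line, that lines which do not match the pattern (such as the header) get an empty key, and that plain code-point string comparison is close enough to BBEdit's ordering. The names `sort_key` and `sort_lines` are made up for illustration, the header is truncated, and the wrapped Riedering record was rejoined with a space for this sketch.

```python
import re

# GP's "Searching pattern:" from the thread, unchanged (split only for line length).
PATTERN = re.compile(
    r"\d{3};\w{3};[^;]*;[^;]*;\d{10};(\w{2});(\d{2});(\d{5});([^;]*);[^;]*;"
    r"([^;]*);([^;]*);([^;]*);[^;]*;\w{2};\d{2};\d{5};[^;]*;\d{12};[^;]*;[^;]*;"
    r"\d{8};[^;]*;\d{12};[^;]*;[^;]*;\d;\d;\d;\d;\d;\d;\d;\d;([^;]*);[^\n]*"
)

PREFIX = "200;BAG;20250324080508_/ETN/PM_EAV_ADR_CHK_ADRC_V14157F;;"
LINES = [
    "MANDT;BU;IDENTIFIER;OBJNR;ADRC_ADDRNUMBER",  # header, truncated for brevity
    PREFIX + "0007723657;DE;09;85655;Aying;;Kaps;Kaps;;;DE;09;85653;Aying;"
    "500000002262;;Kaps;00000010;Kaps;700055566100;;;0;0;1;0;3;0;0;1;;",
    # Wrapped lines rejoined with a space (an assumption about the original data).
    PREFIX + "0007723658;DE;09;83083;Riedering;;Patting;Patting;;;DE;09;83083;"
    "Riedering;500000002552;b Rosenheim, Oberbay;Patting;00000037;"
    "Pattinger Straße;910003809300;;;0;0;0;0;1;0;1;1;;",
    PREFIX + "0007723674;DE;09;85655;Aying;;Großhelfendorf;Hirschbergstraße;;;"
    "DE;09;85653;Aying;500000002262;;Großhelfendorf;00000007;"
    "Hirschbergstraße;910002873200;;;0;0;1;0;3;0;0;1;;",
]


def sort_key(line):
    # Concatenate the captures in \8\1\2\3\4\5\6\7 order.
    # Non-matching lines get an empty key (an assumption of this sketch).
    match = PATTERN.search(line)
    if match is None:
        return ""
    return "".join(match.group(8, 1, 2, 3, 4, 5, 6, 7))


def sort_lines(lines):
    # sorted() is stable, so lines with equal keys keep their input order.
    return sorted(lines, key=sort_key)


for line in sort_lines(LINES):
    print(line.split(";")[4])  # header label, then the ADRC_ADDRNUMBER values
```

With this sample the header stays first and the records come out ordered by postal code (83083 before 85655), with the two Aying records ordered by ADRC_CITY2.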

I also tried it with both "Sorted lines to new document" and "Sorted lines replace selection" options set as you have, and I got the same result as above. I also tried it with line endings set to "Windows (CRLF)" as you have it
and got the same result.

Here's a screenshot of the Find Differences result:
[image: Compare_sort_lines.png]
Note that on the "Sorted Lines" side the lines are neat and tidy, sorted on the
ADRC_POST_CODE1 column/field.
On Monday, April 7, 2025 at 4:09:32 AM UTC-7 Vlad Ghitulescu wrote:

Hi GP,


First of all: I changed the order of the lines in my sample a bit, to this

—


MANDT;BU;IDENTIFIER;OBJNR;ADRC_ADDRNUMBER;ADRC_COUNTRY;ADRC_REGION;ADRC_POST_CODE1;ADRC_CITY1;ADRC_CITY_EXT;ADRC_CITY2;ADRC_STREET;ADRC_HOUSE_NUM1;ADRC_HOUSE_NUM2;LOKAREF_COUNTRY;LOKAREF_REGION;LOKAREF_POST_CODE1;LOKAREF_CITY1;LOKAREF_CITY_CODE;LOKAREF_CITY_EXT;LOKAREF_CITY2;LOKAREF_CITYP_CODE;LOKAREF_STREET;LOKAREF_STRT_CODE;LOKAREF_HOUSE_NUM1;LOKAREF_HOUSE_NUM2;COUNTRY_KZ;REGION_KZ;POST_CODE1_KZ;CITY1_KZ;CITY_EXT_KZ;CITY2_KZ;STREET_KZ;ADR_CHK_KZ;MSGNO;MESSAGE
200;BAG;20250324080508_/ETN/PM_EAV_ADR_CHK_ADRC_V14157F;;0007723592;DE;09;86415;Mering;;Sankt
Afra;Egerländer Straße;;;DE;09;86415;Mering;500000002795;, Schwab;Sankt
Afra;00000006;Egerländerstraße;910011919800;;;0;0;0;0;1;0;1;1;;
200;BAG;20250324080508_/ETN/PM_EAV_ADR_CHK_ADRC_V14157F;;0007723918;DE;09;93336;Altmannstein;;Berghausen;Altmannsteiner
Str.;;;DE;09;93336;Altmannstein;500000005266;;Berghausen;00000003;Altmannsteiner
Straße;910001339100;;;0;0;0;0;3;0;1;1;;

200;BAG;20250324080508_/ETN/PM_EAV_ADR_CHK_ADRC_V14157F;;0007723657;DE;09;85655;Aying;;Kaps;Kaps;;;DE;09;85653;Aying;500000002262;;Kaps;00000010;Kaps;700055566100;;;0;0;1;0;3;0;0;1;;
200;BAG;20250324080508_/ETN/PM_EAV_ADR_CHK_ADRC_V14157F;;0007723878;DE;09;93336;Altmannstein;;Berghausen;Altmannsteiner
Str.;;;DE;09;93336;Altmannstein;500000005266;;Berghausen;00000003;Altmannsteiner
Straße;910001339100;;;0;0;0;0;3;0;1;1;;
200;BAG;20250324080508_/ETN/PM_EAV_ADR_CHK_ADRC_V14157F;;0007723658;DE;09;83083;Riedering;;Patting;Patting;;;DE;09;83083;Riedering;500000002552;b
Rosenheim, Oberbay;Patting;00000037;Pattinger
Straße;910003809300;;;0;0;0;0;1;0;1;1;;

200;BAG;20250324080508_/ETN/PM_EAV_ADR_CHK_ADRC_V14157F;;0007723674;DE;09;85655;Aying;;Großhelfendorf;Hirschbergstraße;;;DE;09;85653;Aying;500000002262;;Großhelfendorf;00000007;Hirschbergstraße;910002873200;;;0;0;1;0;3;0;0;1;;
200;BAG;20250324080508_/ETN/PM_EAV_ADR_CHK_ADRC_V14157F;;0007723908;DE;09;93336;Altmannstein;;Berghausen;Altmannsteiner
Str.;;;DE;09;93336;Altmannstein;500000005266;;Berghausen;00000003;Altmannsteiner
Straße;910001339100;;;0;0;0;0;3;0;1;1;;
200;BAG;20250324080508_/ETN/PM_EAV_ADR_CHK_ADRC_V14157F;;0007724554;DE;09;95131;Schwarzenbach
a.Wald;;Schwarzenbach a
Wald;Walter-Münch-Straße;;;DE;09;95131;Schwarzenbach
a.Wald;500000011836;;Schwarzenbach
a.Wald;00000001;Walter-Münch-Straße;910007835500;;;0;0;0;0;3;1;0;1;;
200;BAG;20250324080508_/ETN/PM_EAV_ADR_CHK_ADRC_V14157F;;0007723956;DE;09;93336;Altmannstein;;Berghausen;Altmannsteiner
Str.;;;DE;09;93336;Altmannstein;500000005266;;Berghausen;00000003;Altmannsteiner
Straße;910001339100;;;0;0;0;0;3;0;1;1;;
200;BAG;20250324080508_/ETN/PM_EAV_ADR_CHK_ADRC_V14157F;;0007724593;DE;09;95131;Schwarzenbach
a.Wald;;Schwarzenbach a
Wald;Walter-Münch-Straße;;;DE;09;95131;Schwarzenbach
a.Wald;500000011836;;Schwarzenbach
a.Wald;00000001;Walter-Münch-Straße;910007835500;;;0;0;0;0;3;1;0;1;;

—

in order to have the 10 records NOT already sorted.

I then built the grep piece by piece in BBEdit's Pattern Playground, as
you suggested

[image: CleanShot 2025-04-07 at 06.14.52.png]

and made only a minor change to the "*Replace pattern*" so that the
semicolons are kept (see above).
The grep matches every line of the sample data except the first. Hurray!
That means that sorting the changed file will sort the lines the way I wanted. After that, I would only need to put the columns back in their original
order.

Now that I know for sure 😉 that the grep works, I wanted to get "*Sort lines…*" working as well, so I put your grep into "*Sort lines…*"
again

[image: CleanShot 2025-04-07 at 06.17.42.png]

and also checked "*Sorted lines to new document*".

As you can see above, the lines were still NOT sorted afterwards (see the ADRC_POST_CODE1 column marked in the screenshot above). In fact they do not differ at all from the original, as comparing the two front windows
shows:

[image: CleanShot 2025-04-07 at 06.16.43.png]

Did I still miss something?


Regards,
Vlad







On 28.03.2025 at 19:16, GP <gp-bbed...@hotmail.com> wrote:

Your Pattern Playground results are perplexing. Using your first post's
example CSV data, the grep:



\d{3};\w{3};[^;]*;[^;]*;\d{10};(\w{2});(\d{2});(\d{5});([^;]*);[^;]*;([^;]*);([^;]*);([^;]*);[^;]*;\w{2};\d{2};\d{5};[^;]*;\d{12};[^;]*;[^;]*;\d{8};[^;]*;\d{12};[^;]*;[^;]*;\d;\d;\d;\d;\d;\d;\d;\d;([^;]*);[^\n]*

results in every line but the first column labels line matching.

To figure out what the problem might be on your system with your local language configuration, use either BBEdit's Pattern Playground or regex101 and build the grep pattern from scratch, left to right, one semicolon-delimited field pattern part at a time. E.g., first \d{3}; which should find/highlight 7 matches in each line of the example CSV data; second, add \w{3}; for a total grep of \d{3};\w{3}; which should
result in the leading 200;BAG; being highlighted for each line in the
example. Continue like that until the next added semicolon-delimited field pattern part fails to show a match for the left-hand part of each line in the example data. It'll be something in that line's or lines' field/column that doesn't meet the matching criteria of the just-added grep pattern
part.
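The left-to-right build-up can also be automated. The following is a hedged Python sketch, not something BBEdit or regex101 provides: it splits the pattern above into its 36 per-field parts and reports the first part whose addition stops the pattern from matching at the start of a line. The function name `first_failing_part` and the two malformed sample lines are hypothetical. Note that `re.match` anchors at the start of the line, whereas the grep in BBEdit is unanchored.

```python
import re

# The search pattern from the thread, one list entry per semicolon-delimited field.
PARTS = [
    r"\d{3};", r"\w{3};", r"[^;]*;", r"[^;]*;", r"\d{10};",
    r"(\w{2});", r"(\d{2});", r"(\d{5});", r"([^;]*);", r"[^;]*;",
    r"([^;]*);", r"([^;]*);", r"([^;]*);", r"[^;]*;",
    r"\w{2};", r"\d{2};", r"\d{5};", r"[^;]*;", r"\d{12};",
    r"[^;]*;", r"[^;]*;", r"\d{8};", r"[^;]*;", r"\d{12};",
    r"[^;]*;", r"[^;]*;",
    *[r"\d;"] * 8,
    r"([^;]*);", r"[^\n]*",
]


def first_failing_part(line):
    # Return the 1-based number of the first field part that breaks the match,
    # or None if the whole pattern matches from the start of the line.
    for count in range(1, len(PARTS) + 1):
        if re.match("".join(PARTS[:count]), line) is None:
            return count
    return None


PREFIX = "200;BAG;20250324080508_/ETN/PM_EAV_ADR_CHK_ADRC_V14157F;;"
GOOD = (PREFIX + "0007723657;DE;09;85655;Aying;;Kaps;Kaps;;;DE;09;85653;Aying;"
        "500000002262;;Kaps;00000010;Kaps;700055566100;;;0;0;1;0;3;0;0;1;;")
# Hypothetical malformed lines, invented for this sketch:
SHORT_POSTCODE = GOOD.replace(";85655;", ";8565;")            # 4-digit postal code
BROKEN_RECORD = PREFIX + "0007723657;DE;09;85655;Aying;;Kaps;Kaps;12"  # line ends early

for label, line in [("good", GOOD), ("short postcode", SHORT_POSTCODE),
                    ("broken record", BROKEN_RECORD)]:
    print(label, first_failing_part(line))
```

Because each part consumes exactly one field up to its semicolon, the reported number is the position of the offending column: 8 (ADRC_POST_CODE1) for the short postal code and 13 (ADRC_HOUSE_NUM1) for the record cut off after the house number.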

In addition to sorting, another use of a working grep pattern is with BBEdit's Text -> Process Lines Containing... to find all lines that do NOT contain that grep pattern, which will help in finding malformed CSV data in the large CSV data files you're working with.
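The same "lines that do NOT match" check can be sketched in Python for comparison. This is only an illustration under the assumption that the file is examined line by line; `non_matching` is a made-up helper name, the header is truncated, and the third sample line is a hypothetical broken record.

```python
import re

# The search pattern from the thread, unchanged (split only for line length).
PATTERN = re.compile(
    r"\d{3};\w{3};[^;]*;[^;]*;\d{10};(\w{2});(\d{2});(\d{5});([^;]*);[^;]*;"
    r"([^;]*);([^;]*);([^;]*);[^;]*;\w{2};\d{2};\d{5};[^;]*;\d{12};[^;]*;[^;]*;"
    r"\d{8};[^;]*;\d{12};[^;]*;[^;]*;\d;\d;\d;\d;\d;\d;\d;\d;([^;]*);[^\n]*"
)


def non_matching(lines):
    # Collect (line number, text) for every line the pattern does not match.
    return [
        (number, line)
        for number, line in enumerate(lines, start=1)
        if PATTERN.search(line) is None
    ]


PREFIX = "200;BAG;20250324080508_/ETN/PM_EAV_ADR_CHK_ADRC_V14157F;;"
LINES = [
    "MANDT;BU;IDENTIFIER;OBJNR;ADRC_ADDRNUMBER",  # header, truncated for brevity
    PREFIX + "0007723657;DE;09;85655;Aying;;Kaps;Kaps;;;DE;09;85653;Aying;"
    "500000002262;;Kaps;00000010;Kaps;700055566100;;;0;0;1;0;3;0;0;1;;",
    # Hypothetical record cut off by a stray line break after the house number:
    PREFIX + "0007723657;DE;09;85655;Aying;;Kaps;Kaps;12",
]

for number, line in non_matching(LINES):
    print(number, line[:60])
```

The header line is reported as well, since it is not a data record; in a real file that is the one expected non-match.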

On Friday, March 28, 2025 at 7:12:03 AM UTC-7 Vlad Ghitulescu wrote:

Hey GP


I corrected the error regarding "Specific sub-patterns:", but this didn't seem to
change anything: ADRC_POST_CODE1 is still not sorted

<CleanShot 2025-03-28 at 10.02.07.png>


The command also gave no recognizable sign that it had finished, so I'm not sure whether it also had problems with line 25816, where the CRLF
follows a house number (see previous emails).

BBEdit's Pattern Playground, however, shows no result when
searching with the regex

<CleanShot 2025-03-28 at 10.09.51.png>

I'll take the regex to regex101 (thanks for the hint!) and see if I can
spot an error.



Regards,
Vlad




On 26.03.2025 at 19:42, GP <gp-bbed...@hotmail.com> wrote:

First, in your Sort Lines dialog screenshot, you need to select the
"Specific sub-patterns:" option instead of "Entire match" in order for the lines to be sorted by your column sorting criteria (MSGNO, ADRC_COUNTRY,
ADRC_REGION, ADRC_POST_CODE1, ADRC_CITY1, ADRC_CITY2, ADRC_STREET and
ADRC_HOUSE_NUM1). Since the sort lines grep pattern:


\d{3};\w{3};[^;]*;[^;]*;\d{10};(\w{2});(\d{2});(\d{5});([^;]*);[^;]*;([^;]*);([^;]*);([^;]*);[^;]*;\w{2};\d{2};\d{5};[^;]*;\d{12};[^;]*;[^;]*;\d{8};[^;]*;\d{12};[^;]*;[^;]*;\d;\d;\d;\d;\d;\d;\d;\d;([^;]*);[^\n]*

will match every line in your example, using the "Entire match" option devolves the sort into a simple whole line string sort which would put the MSGNO (i.e. \8 in the example) column contents last instead of first in the sort order. (See the "Sort Lines" section in Chapter 5 of the BBEdit User
Manual for details of using sub-pattern sort ordering.)

With the "Entire match" option, if you look at every line from the second onward, the left part of each line is the same until you get to the part of the string with
the ADRC_ADDRNUMBER characters, so the differences in that part of the
string are what Sort Lines' "Entire match" uses to determine the ordering of
the whole line strings.

Using the "Specific sub-patterns:" option is what allows you to specify which substring part(s) of a string/line, and in what order those
substrings are concatenated, will be used in determining the sort ordering
between whole strings/lines.

To see what's going on with Sort Lines' "Specific sub-patterns:" option,
you can use BBEdit's Pattern Playground to see which concatenated
substring for a line is being used to determine line sort ordering. For
"Search pattern:" put:


\d{3};\w{3};[^;]*;[^;]*;\d{10};(\w{2});(\d{2});(\d{5});([^;]*);[^;]*;([^;]*);([^;]*);([^;]*);[^;]*;\w{2};\d{2};\d{5};[^;]*;\d{12};[^;]*;[^;]*;\d{8};[^;]*;\d{12};[^;]*;[^;]*;\d;\d;\d;\d;\d;\d;\d;\d;([^;]*);[^\n]*

and for "Replace pattern" put:

\8\1\2\3\4\5\6\7

and for "Contents of" choose an open example file.

As you step through each grep pattern match (using the Next button), the "Replacement text:" field will show you the concatenated string composed of the captured substrings of the whole matched string/line, in the order given by the replace pattern. It is that "Replacement text:" string that Sort Lines uses for "Specific
sub-patterns:" option sorting evaluation.
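To make the difference between the two options concrete, here is a small Python sketch that computes both strings for one record from the sample. It only mimics what Pattern Playground displays; Python's `Match.expand` happens to accept the same `\8\1\2\3\4\5\6\7` replacement syntax, which is why it is used here. The variable names are illustrative.

```python
import re

# The search pattern from the thread, unchanged (split only for line length).
PATTERN = re.compile(
    r"\d{3};\w{3};[^;]*;[^;]*;\d{10};(\w{2});(\d{2});(\d{5});([^;]*);[^;]*;"
    r"([^;]*);([^;]*);([^;]*);[^;]*;\w{2};\d{2};\d{5};[^;]*;\d{12};[^;]*;[^;]*;"
    r"\d{8};[^;]*;\d{12};[^;]*;[^;]*;\d;\d;\d;\d;\d;\d;\d;\d;([^;]*);[^\n]*"
)

LINE = (
    "200;BAG;20250324080508_/ETN/PM_EAV_ADR_CHK_ADRC_V14157F;;0007723657;"
    "DE;09;85655;Aying;;Kaps;Kaps;;;DE;09;85653;Aying;500000002262;;Kaps;"
    "00000010;Kaps;700055566100;;;0;0;1;0;3;0;0;1;;"
)

match = PATTERN.search(LINE)

# "Entire match": the sort would compare the whole matched line.
entire_match = match.group(0)

# "Specific sub-patterns:" with \8\1\2\3\4\5\6\7: the sort compares only this string.
replacement_text = match.expand(r"\8\1\2\3\4\5\6\7")

print(replacement_text)  # prints DE0985655AyingKapsKaps
```

MSGNO (`\8`) and ADRC_HOUSE_NUM1 (`\7`) are empty in this record, so they contribute nothing to the key, and the postal code ends up as the first field that differs between records.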

P.S. If an explanation of what the parts of a grep regular expression are
specifying would help, https://regex101.com has a pretty good
explanation panel that explains what each bit of a regular expression is
doing.
On Wednesday, March 26, 2025 at 6:24:57 AM UTC-7 Vlad Ghitulescu wrote:

Hey GP


And thanks for the suggestion!

I tried the sort-solution before trying to understand the regex itself 😶

I pasted it into Text -> Sort Lines… like this

[image: CleanShot 2025-03-26 at 08.24.24.png]

but after Sort it doesn’t look like the postal code column was considered

[image: CleanShot 2025-03-26 at 08.25.19.png]

Did I miss something?

Thanks again!


Regards,
Vlad





On 25.03.2025 at 22:32, GP <gp-bbed...@hotmail.com> wrote:

As a follow up...

BBEdit's Pattern Playground is a great help in constructing tedious grep patterns like you'll need for your filtering and sorting needs. The really tedious part is getting the field position(s) you want to filter or sort on so you can modify that field's match pattern to conform to the desired
filter or sorting criteria.

For example... For your " Filter all lines that have ADR_CHK_KZ = 1" using
Text -> Process Lines Containing ... with the grep pattern:



\d{3};\w{3};[^;]*;[^;]*;\d{10};\w{2};\d{2};\d{5};[^;]*;[^;]*;[^;]*;[^;]*;[^;]*;[^;]*;\w{2};\d{2};\d{5};[^;]*;\d{12};[^;]*;[^;]*;\d{8};[^;]*;\d{12};[^;]*;[^;]*;\d;\d;\d;\d;\d;\d;\d;(1);[^;]*;[^\n]*

will do the trick. For filtering you don't need the group capturing on the 1 but it is useful with Pattern Playground to verify you're getting the
right field position and field contents matched.
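As an alternative for cross-checking outside BBEdit, the same filter can be written against the column name instead of the field position. This is a sketch using Python's standard `csv` module, not the grep approach described above. It assumes semicolon-delimited records on single lines with no quoting; `rows_where` is a made-up helper, and the second sample record is a hypothetical copy with ADR_CHK_KZ changed to 0 (every record in the thread's sample has the value 1).

```python
import csv
import io

HEADER = (
    "MANDT;BU;IDENTIFIER;OBJNR;ADRC_ADDRNUMBER;ADRC_COUNTRY;ADRC_REGION;"
    "ADRC_POST_CODE1;ADRC_CITY1;ADRC_CITY_EXT;ADRC_CITY2;ADRC_STREET;"
    "ADRC_HOUSE_NUM1;ADRC_HOUSE_NUM2;LOKAREF_COUNTRY;LOKAREF_REGION;"
    "LOKAREF_POST_CODE1;LOKAREF_CITY1;LOKAREF_CITY_CODE;LOKAREF_CITY_EXT;"
    "LOKAREF_CITY2;LOKAREF_CITYP_CODE;LOKAREF_STREET;LOKAREF_STRT_CODE;"
    "LOKAREF_HOUSE_NUM1;LOKAREF_HOUSE_NUM2;COUNTRY_KZ;REGION_KZ;"
    "POST_CODE1_KZ;CITY1_KZ;CITY_EXT_KZ;CITY2_KZ;STREET_KZ;ADR_CHK_KZ;"
    "MSGNO;MESSAGE"
)
PREFIX = "200;BAG;20250324080508_/ETN/PM_EAV_ADR_CHK_ADRC_V14157F;;"
SAMPLE = "\n".join([
    HEADER,
    PREFIX + "0007723657;DE;09;85655;Aying;;Kaps;Kaps;;;DE;09;85653;Aying;"
    "500000002262;;Kaps;00000010;Kaps;700055566100;;;0;0;1;0;3;0;0;1;;",
    # Hypothetical variant: ADR_CHK_KZ set to 0 for the sake of the example.
    PREFIX + "0007723674;DE;09;85655;Aying;;Großhelfendorf;Hirschbergstraße;;;"
    "DE;09;85653;Aying;500000002262;;Großhelfendorf;00000007;"
    "Hirschbergstraße;910002873200;;;0;0;1;0;3;0;0;0;;",
])


def rows_where(text, column, value):
    # Keep the records whose named column equals the given string value.
    reader = csv.DictReader(io.StringIO(text), delimiter=";", quoting=csv.QUOTE_NONE)
    return [row for row in reader if row[column] == value]


for row in rows_where(SAMPLE, "ADR_CHK_KZ", "1"):
    print(row["ADRC_ADDRNUMBER"], row["ADRC_POST_CODE1"])
```

Addressing the column by name avoids counting semicolons, but unlike the grep pattern it does no validation of the other fields, so malformed lines would pass through unnoticed.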

For your "Sort the file by MSGNO, ADRC_COUNTRY, ADRC_REGION,
ADRC_POST_CODE1, ADRC_CITY1, ADRC_CITY2, ADRC_STREET and ADRC_HOUSE_NUM1"
using Text -> Sort Lines ... with a grep pattern of:


\d{3};\w{3};[^;]*;[^;]*;\d{10};(\w{2});(\d{2});(\d{5});([^;]*);[^;]*;([^;]*);([^;]*);([^;]*);[^;]*;\w{2};\d{2};\d{5};[^;]*;\d{12};[^;]*;[^;]*;\d{8};[^;]*;\d{12};[^;]*;[^;]*;\d;\d;\d;\d;\d;\d;\d;\d;([^;]*);[^\n]*

with "Specific sub-patterns" selected and \8\1\2\3\4\5\6\7 in the fill-in
field will sort your example text using your desired field ordering.
On Tuesday, March 25, 2025 at 12:53:47 PM UTC-7 GP wrote:

For filtering, look at Text -> Process Lines Containing ... and for
sorting Text -> Sort Lines ... using grep patterns to identify what you want to match for filtering and which sub-pattern field or fields you want the
sort ordered on.

If the number of fields in your sample is representative of the real CSV files you're working with, it is going to be something of a pain in the rear coming up with the grep patterns needed to accomplish the desired
filtering and sorting.

On Tuesday, March 25, 2025 at 11:03:35 AM UTC-7 Vlad Ghitulescu wrote:

Hey,


I use BBEdit very often while working with big CSV-files (300 - 500 MB, up
to 4 million rows) looking like this:


MANDT;BU;IDENTIFIER;OBJNR;ADRC_ADDRNUMBER;ADRC_COUNTRY;ADRC_REGION;ADRC_POST_CODE1;ADRC_CITY1;ADRC_CITY_EXT;ADRC_CITY2;ADRC_STREET;ADRC_HOUSE_NUM1;ADRC_HOUSE_NUM2;LOKAREF_COUNTRY;LOKAREF_REGION;LOKAREF_POST_CODE1;LOKAREF_CITY1;LOKAREF_CITY_CODE;LOKAREF_CITY_EXT;LOKAREF_CITY2;LOKAREF_CITYP_CODE;LOKAREF_STREET;LOKAREF_STRT_CODE;LOKAREF_HOUSE_NUM1;LOKAREF_HOUSE_NUM2;COUNTRY_KZ;REGION_KZ;POST_CODE1_KZ;CITY1_KZ;CITY_EXT_KZ;CITY2_KZ;STREET_KZ;ADR_CHK_KZ;MSGNO;MESSAGE

200;BAG;20250324080508_/ETN/PM_EAV_ADR_CHK_ADRC_V14157F;;0007723592;DE;09;86415;Mering;;Sankt
Afra;Egerländer Straße;;;DE;09;86415;Mering;500000002795;, Schwab;Sankt
Afra;00000006;Egerländerstraße;910011919800;;;0;0;0;0;1;0;1;1;;

200;BAG;20250324080508_/ETN/PM_EAV_ADR_CHK_ADRC_V14157F;;0007723657;DE;09;85655;Aying;;Kaps;Kaps;;;DE;09;85653;Aying;500000002262;;Kaps;00000010;Kaps;700055566100;;;0;0;1;0;3;0;0;1;;

200;BAG;20250324080508_/ETN/PM_EAV_ADR_CHK_ADRC_V14157F;;0007723658;DE;09;83083;Riedering;;Patting;Patting;;;DE;09;83083;Riedering;500000002552;b
Rosenheim, Oberbay;Patting;00000037;Pattinger
Straße;910003809300;;;0;0;0;0;1;0;1;1;;

200;BAG;20250324080508_/ETN/PM_EAV_ADR_CHK_ADRC_V14157F;;0007723674;DE;09;85655;Aying;;Großhelfendorf;Hirschbergstraße;;;DE;09;85653;Aying;500000002262;;Großhelfendorf;00000007;Hirschbergstraße;910002873200;;;0;0;1;0;3;0;0;1;;

200;BAG;20250324080508_/ETN/PM_EAV_ADR_CHK_ADRC_V14157F;;0007723878;DE;09;93336;Altmannstein;;Berghausen;Altmannsteiner
Str.;;;DE;09;93336;Altmannstein;500000005266;;Berghausen;00000003;Altmannsteiner
Straße;910001339100;;;0;0;0;0;3;0;1;1;;
200;BAG;20250324080508_/ETN/PM_EAV_ADR_CHK_ADRC_V14157F;;0007723908;DE;09;93336;Altmannstein;;Berghausen;Altmannsteiner
Str.;;;DE;09;93336;Altmannstein;500000005266;;Berghausen;00000003;Altmannsteiner
Straße;910001339100;;;0;0;0;0;3;0;1;1;;
200;BAG;20250324080508_/ETN/PM_EAV_ADR_CHK_ADRC_V14157F;;0007723918;DE;09;93336;Altmannstein;;Berghausen;Altmannsteiner
Str.;;;DE;09;93336;Altmannstein;500000005266;;Berghausen;00000003;Altmannsteiner
Straße;910001339100;;;0;0;0;0;3;0;1;1;;
200;BAG;20250324080508_/ETN/PM_EAV_ADR_CHK_ADRC_V14157F;;0007723956;DE;09;93336;Altmannstein;;Berghausen;Altmannsteiner
Str.;;;DE;09;93336;Altmannstein;500000005266;;Berghausen;00000003;Altmannsteiner
Straße;910001339100;;;0;0;0;0;3;0;1;1;;
200;BAG;20250324080508_/ETN/PM_EAV_ADR_CHK_ADRC_V14157F;;0007724554;DE;09;95131;Schwarzenbach
a.Wald;;Schwarzenbach a
Wald;Walter-Münch-Straße;;;DE;09;95131;Schwarzenbach
a.Wald;500000011836;;Schwarzenbach
a.Wald;00000001;Walter-Münch-Straße;910007835500;;;0;0;0;0;3;1;0;1;;
200;BAG;20250324080508_/ETN/PM_EAV_ADR_CHK_ADRC_V14157F;;0007724593;DE;09;95131;Schwarzenbach
a.Wald;;Schwarzenbach a
Wald;Walter-Münch-Straße;;;DE;09;95131;Schwarzenbach
a.Wald;500000011836;;Schwarzenbach
a.Wald;00000001;Walter-Münch-Straße;910007835500;;;0;0;0;0;3;1;0;1;;

Once in a while I’d like to filter or sort such huge files by one or more
columns, like:

1. Filter all lines that have ADR_CHK_KZ = 1 or
2. Sort the file by MSGNO, ADRC_COUNTRY, ADRC_REGION, ADRC_POST_CODE1,
ADRC_CITY1, ADRC_CITY2, ADRC_STREET and ADRC_HOUSE_NUM1.

Is there a way to do this sort of task with BBEdit?

Thanks!


Regards,
Vlad




--
This is the BBEdit Talk public discussion group. If you have a feature request or believe that the application isn't working correctly, please email "sup...@barebones.com" rather than posting here. Follow @bbedit on
Mastodon: <https://mastodon.social/@bbedit>
---
You received this message because you are subscribed to the Google Groups
"BBEdit Talk" group.
To unsubscribe from this group and stop receiving emails from it, send an
email to bbedit+un...@googlegroups.com.

To view this discussion visit
https://groups.google.com/d/msgid/bbedit/50130484-14eb-4298-b762-800f88b2c66en%40googlegroups.com
<https://groups.google.com/d/msgid/bbedit/50130484-14eb-4298-b762-800f88b2c66en%40googlegroups.com?utm_medium=email&utm_source=footer>
.


