Re: [FFmpeg-user] Decimation with ppsrc=1 malfunctions
On 12/27/2020 01:23 PM, Paul B Mahol wrote: On Sun, Dec 27, 2020 at 6:45 PM Mark Filipak (ffmpeg) wrote: On 12/27/2020 06:19 AM, Paul B Mahol wrote: On Sun, Dec 27, 2020 at 5:00 AM Mark Filipak (ffmpeg) < markfili...@bog.us> wrote: Decimation with ppsrc=1 malfunctions. Paul fixed this somewhat, but it's still in trouble. Since I eliminated threading and decimation as the cause, I suspect that frame numbers are being bollixed. The report package is here: https://www.dropbox.com/t/zu4cezneCJIsdmUu The decimate filter only drops frames; it never fixes combing. Of course. Do you know the exact telecine pattern used in your samples? Look at 'source.mkv frames 1018-1029.png'. I have a scenario that explains why a sober engineer would be forced to do what was done, but what's the point? Combing is not the issue. The issue is that the duplicate (modulo 5) frames found in preprocessed.mkv are not being removed from source.mkv. source.mkv & preprocessed.mkv are 172 frames. target.mkv should be 138 frames. It is 138 frames, but they are the wrong frames. Worse, the frames removed are not modulo 5 but vary instead. The only cause I could imagine was a mixup due to threading, but single-threading didn't solve the problem. I really wonder what your real motivation is for writing such a long text that basically says nothing. My motivation (as always) is to get the best possible transcode given the existing source video. The decimate filter removes duplicated, decombed frames left over from field-match processing. Combed or decombed, what does that matter? What matters is duplicated frames. Using it for anything else is pointless. For the default cycle of 5, it removes, from every 5 frames, the single frame that has the lowest difference metric. In preprocessed.mkv every 5th frame is a repeat of frame 4. This is confirmed because decimating preprocessed.mkv succeeds. However, applying decimate=ppsrc=1 does not remove the correct frames from source.mkv. How can that happen? If you wish to drop a fixed X-th frame from every cycle of 5 frames, use the shuffleframes or select filter. I can't (easily) predict in advance which frame is the repeat. Simply removing it via decimate solves the first step of the problem no matter which frame is the repeated one. Decimate would work on source.mkv if source.mkv didn't have frame numbers burned in. But source.mkv does have frame numbers burned in. My approach is to cut the burned-in frame numbers out of source.mkv, save the result as 'preprocessed.mkv', then use preprocessed.mkv as the template (via decimate=ppsrc=1) to drop the correct frames from source.mkv. But that didn't work. It appears that between preprocessed.mkv and the decimation of source.mkv, the frame numbers are getting mixed up. Now, I know that the problem isn't with preprocessed.mkv. I know that because if I decimate preprocessed.mkv, the correct frames are dropped. So the problem must be in the communication between decimate=ppsrc=1 and the handler that's dropping frames from source.mkv.
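For anyone reproducing the approach described above, a minimal sketch; the crop geometry is a placeholder (it depends on where the frame numbers are burned in), not a value taken from the report. With ppsrc=1, decimate computes its duplicate metrics on the first input but emits the kept frames from the second:

ffmpeg -i source.mkv -vf "crop=iw:ih-64:0:0" preprocessed.mkv
ffmpeg -i preprocessed.mkv -i source.mkv -filter_complex "decimate=ppsrc=1" target.mkv -y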
Re: [FFmpeg-user] Decimation with ppsrc=1 malfunctions
On 12/27/2020 07:30 AM, Carl Eugen Hoyos wrote: On 27.12.2020 at 12:19, Paul B Mahol wrote: Do you know the exact telecine pattern used in your samples? I don't think the sample is telecined... It must have been. Now, bear with me. 1, The original video was p24: [A/a][B/b][C/c][D/d] 2, Later, it was telecined: [A/a][B/b][B/c][C/d][D/d] 3, Later, a modulo-5 frame was dropped [1]: [A/a][B/c][C/d][D/d] 4, Later, a modulo-4 frame was repeated: [A/a][B/c][C/d][D/d][D/d] [1] Dropping a frame in lieu of detelecine was a mistake. Perhaps the engineer was drunk. ... or at least whatever was done with it can't be undone (even the description confirms this)... Ah! But it can be undone (sort of): Decimate to: [A/a][B/c][C/d][D/d] Then synthesize a frame: [A/a][B/B][B/c][C/d][D/d] Then detelecine: [A/a][B/B][C/c][D/d] ... so this part of the discussion seems moot. I agree. It's moot because it doesn't bear on why decimate=ppsrc=1 isn't working correctly.
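For the "then detelecine" step above, ffmpeg's detelecine filter is one candidate. A minimal sketch, with the pattern and field order as assumptions (they would have to be read off the actual sample); the frame-synthesis step in the middle has no single stock filter -- minterpolate is one possibility:

ffmpeg -i decimated.mkv -vf "detelecine=first_field=top:pattern=32" detelecined.mkv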
Re: [FFmpeg-user] Decimation with ppsrc=1 malfunctions
On 12/27/2020 06:19 AM, Paul B Mahol wrote: On Sun, Dec 27, 2020 at 5:00 AM Mark Filipak (ffmpeg) wrote: Decimation with ppsrc=1 malfunctions. Paul fixed this somewhat, but it's still in trouble. Since I eliminated threading and decimation as the cause, I suspect that frame numbers are being bollixed. The report package is here: https://www.dropbox.com/t/zu4cezneCJIsdmUu The decimate filter only drops frames; it never fixes combing. Of course. Do you know the exact telecine pattern used in your samples? Look at 'source.mkv frames 1018-1029.png'. I have a scenario that explains why a sober engineer would be forced to do what was done, but what's the point? Combing is not the issue. The issue is that the duplicate (modulo 5) frames found in preprocessed.mkv are not being removed from source.mkv. source.mkv & preprocessed.mkv are 172 frames. target.mkv should be 138 frames. It is 138 frames, but they are the wrong frames. Worse, the frames removed are not modulo 5 but vary instead. The only cause I could imagine was a mixup due to threading, but single-threading didn't solve the problem. Over to you... :-)
[FFmpeg-user] Decimation with ppsrc=1 malfunctions
Decimation with ppsrc=1 malfunctions. Paul fixed this somewhat, but it's still in trouble. Since I eliminated threading and decimation as the cause, I suspect that frame numbers are being bollixed. The report package is here: https://www.dropbox.com/t/zu4cezneCJIsdmUu Expires Jan 2, 2021.
Re: [FFmpeg-user] https://github.com/BtbN/FFmpeg-Builds/releases guidance needed
Never mind. I figured it out based on the sizes of the 'exe' files. I downloaded 'ffmpeg-N-100461-gb0a882cc93-win64-gpl-vulkan.zip'. On 12/24/2020 06:23 AM, Gyan Doshi wrote: On 24-12-2020 02:33 pm, Mark Filipak (ffmpeg) wrote: Here: https://github.com/BtbN/FFmpeg-Builds/releases are these:

ffmpeg-N-100459-ga7f9b3b954-win64-gpl-shared-vulkan.zip
ffmpeg-N-100459-ga7f9b3b954-win64-gpl-shared.zip
ffmpeg-N-100459-ga7f9b3b954-win64-gpl-vulkan.zip
ffmpeg-N-100459-ga7f9b3b954-win64-gpl.zip
ffmpeg-N-100459-ga7f9b3b954-win64-lgpl-shared-vulkan.zip
ffmpeg-N-100459-ga7f9b3b954-win64-lgpl-shared.zip
ffmpeg-N-100459-ga7f9b3b954-win64-lgpl-vulkan.zip
ffmpeg-N-100459-ga7f9b3b954-win64-lgpl.zip
ffmpeg-n4.3.1-26-gca55240b8c-win64-gpl-4.3.zip
ffmpeg-n4.3.1-26-gca55240b8c-win64-gpl-shared-4.3.zip
ffmpeg-n4.3.1-26-gca55240b8c-win64-lgpl-4.3.zip
ffmpeg-n4.3.1-26-gca55240b8c-win64-lgpl-shared-4.3.zip

Sorry to have to ask [1], but what are they? Which one do I download if I have an Intel Core i7-6700HQ CPU with Intel(R) HD Graphics 530 & NVIDIA GeForce GTX 980M? The ones with 'N-' are git builds; the n4.3.1 ones are release builds. gpl and lgpl are the licenses for the binaries contained in them. gpl ones will have more libraries. 'shared' ones have the libav libs in separate files, with the ff*.exe files dynamically linked to them. The ones without 'shared' are statically linked and have everything in the exe. vulkan is a new-ish API for processing on a GPU. You can use the few vulkan filters currently in libavfilter. Your graphics driver must support Vulkan or else ffmpeg won't run. Yours should. Regards, Gyan I like sausage, but please don't tell me what's in it. :-] If I want everything, and I want to be able to say that I found a bug using the latest git version, what do I download?
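A detail that helps with the "found a bug using the latest git version" part: whichever zip you grab, the first line printed by

ffmpeg -version

echoes the exact build string (e.g. "ffmpeg version N-100461-gb0a882cc93-..."), so quoting that line in a bug report identifies precisely which git snapshot you tested.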
Re: [FFmpeg-user] https://github.com/BtbN/FFmpeg-Builds/releases guidance needed
On 12/24/2020 06:23 AM, Gyan Doshi wrote: On 24-12-2020 02:33 pm, Mark Filipak (ffmpeg) wrote: Here: https://github.com/BtbN/FFmpeg-Builds/releases are these:

ffmpeg-N-100459-ga7f9b3b954-win64-gpl-shared-vulkan.zip
ffmpeg-N-100459-ga7f9b3b954-win64-gpl-shared.zip
ffmpeg-N-100459-ga7f9b3b954-win64-gpl-vulkan.zip
ffmpeg-N-100459-ga7f9b3b954-win64-gpl.zip
ffmpeg-N-100459-ga7f9b3b954-win64-lgpl-shared-vulkan.zip
ffmpeg-N-100459-ga7f9b3b954-win64-lgpl-shared.zip
ffmpeg-N-100459-ga7f9b3b954-win64-lgpl-vulkan.zip
ffmpeg-N-100459-ga7f9b3b954-win64-lgpl.zip
ffmpeg-n4.3.1-26-gca55240b8c-win64-gpl-4.3.zip
ffmpeg-n4.3.1-26-gca55240b8c-win64-gpl-shared-4.3.zip
ffmpeg-n4.3.1-26-gca55240b8c-win64-lgpl-4.3.zip
ffmpeg-n4.3.1-26-gca55240b8c-win64-lgpl-shared-4.3.zip

Sorry to have to ask [1], but what are they? Which one do I download if I have an Intel Core i7-6700HQ CPU with Intel(R) HD Graphics 530 & NVIDIA GeForce GTX 980M? The ones with 'N-' are git builds; the n4.3.1 ones are release builds. gpl and lgpl are the licenses for the binaries contained in them. gpl ones will have more libraries. 'shared' ones have the libav libs in separate files, with the ff*.exe files dynamically linked to them. The ones without 'shared' are statically linked and have everything in the exe. vulkan is a new-ish API for processing on a GPU. You can use the few vulkan filters currently in libavfilter. Your graphics driver must support Vulkan or else ffmpeg won't run. Yours should. Regards, Gyan I like sausage, but please don't tell me what's in it. :-] If I want everything, and I want to be able to say that I found a bug using the latest git version, what do I download?
[FFmpeg-user] https://github.com/BtbN/FFmpeg-Builds/releases guidance needed
Here: https://github.com/BtbN/FFmpeg-Builds/releases are these:

ffmpeg-N-100459-ga7f9b3b954-win64-gpl-shared-vulkan.zip
ffmpeg-N-100459-ga7f9b3b954-win64-gpl-shared.zip
ffmpeg-N-100459-ga7f9b3b954-win64-gpl-vulkan.zip
ffmpeg-N-100459-ga7f9b3b954-win64-gpl.zip
ffmpeg-N-100459-ga7f9b3b954-win64-lgpl-shared-vulkan.zip
ffmpeg-N-100459-ga7f9b3b954-win64-lgpl-shared.zip
ffmpeg-N-100459-ga7f9b3b954-win64-lgpl-vulkan.zip
ffmpeg-N-100459-ga7f9b3b954-win64-lgpl.zip
ffmpeg-n4.3.1-26-gca55240b8c-win64-gpl-4.3.zip
ffmpeg-n4.3.1-26-gca55240b8c-win64-gpl-shared-4.3.zip
ffmpeg-n4.3.1-26-gca55240b8c-win64-lgpl-4.3.zip
ffmpeg-n4.3.1-26-gca55240b8c-win64-lgpl-shared-4.3.zip

Sorry to have to ask [1], but what are they? Which one do I download if I have an Intel Core i7-6700HQ CPU with Intel(R) HD Graphics 530 & NVIDIA GeForce GTX 980M? [1] There's no guide at https://github.com/BtbN/FFmpeg-Builds/releases. There's no guide at https://ffmpeg.org/download.html. And searching for "gpl-shared-vulkan", for example, just produces links back to https://github.com/BtbN/FFmpeg-Builds/releases & links to URLs in Chinese. I also read the FAQs and Wikis. Maybe I'm just unlucky, eh? Finally, there's the ffmpeg-user archive, but no way to search it. Oh dear. Guidance is much appreciated.
Re: [FFmpeg-user] ppsrc=1 decimation doesn't work or isn't correct or throws exception
On 12/22/2020 07:35 AM, Paul B Mahol wrote: On Tue, Dec 22, 2020 at 12:21 PM Mark Filipak (ffmpeg) wrote: On 12/22/2020 04:47 AM, Paul B Mahol wrote: On Tue, Dec 22, 2020 at 1:28 AM Mark Filipak (ffmpeg) markfili...@bog.us wrote: Here is a link to the bug report package: https://www.dropbox.com/s/euz4q3daogulhwi/ERROR%20ppsrc%3D1%20DECIMATION%20DOESN%27T%20WORK%20OR%20ISN%27T%20CORRECT%20OR%20THROWS%20EXCEPTION.zip?dl=0 Fixed another bug; this one was there from the beginning of the filter's addition. It happened only if the dimensions of the two inputs were not the same. Thank you, Paul. This process worked well. I wish you and your loved ones a healthy and satisfying 2021.
Re: [FFmpeg-user] ppsrc=1 decimation doesn't work or isn't correct or throws exception
On 12/22/2020 04:47 AM, Paul B Mahol wrote: On Tue, Dec 22, 2020 at 1:28 AM Mark Filipak (ffmpeg) wrote: Here is a link to the bug report package: https://www.dropbox.com/s/euz4q3daogulhwi/ERROR%20ppsrc%3D1%20DECIMATION%20DOESN%27T%20WORK%20OR%20ISN%27T%20CORRECT%20OR%20THROWS%20EXCEPTION.zip?dl=0 A friend assures me that anyone can download the zip file (though inside Dropbox it appears others can only view the files). Everything you developers need is in the zip. But do let me know if there's any problem. I think I fixed this; feel free to test with the latest ffmpeg, which has the decimate filter fix. Hi Paul. I did run the latest git full build. Operation is improved over the previous build, but it still has problems. I very carefully constructed the package I put in Dropbox. You should find it quite complete. Go get it. And good hunting. Regards, Mark.
Re: [FFmpeg-user] ppsrc=1 decimation doesn't work or isn't correct or throws exception
Here is a link to the bug report package: https://www.dropbox.com/s/euz4q3daogulhwi/ERROR%20ppsrc%3D1%20DECIMATION%20DOESN%27T%20WORK%20OR%20ISN%27T%20CORRECT%20OR%20THROWS%20EXCEPTION.zip?dl=0 A friend assures me that anyone can download the zip file (though inside Dropbox it appears others can only view the files). Everything you developers need is in the zip. But do let me know if there's any problem. On 12/21/2020 07:13 PM, Mark Filipak (ffmpeg) wrote: I'm trying to file a bug report but there appears to be no way to get the report to you. I have some very short video samples and the ffmpeg report files. I spent 4 hours on this to make it so clear it will be very easy to replicate. But trac.ffmpeg.org won't accept it. ...Frustrated. Four hours. What a waste of my time. On 12/21/2020 02:52 PM, Mark Filipak (ffmpeg) wrote: Win10-1803. The sources & fault documentation files are here: To be determined -- trac.ffmpeg.org rejects it, too big (2,847,812 bytes) -- got any suggestions? RUNME.CMD : ffmpeg -i preprocessed.mkv -i source.mkv -filter_complex "decimate='ppsrc=1'" target.mkv -y I ran RUNME.CMD 40 times. 33 times, RUNME.CMD didn't do anything: target.mkv = 0 bytes. 3 times, RUNME.CMD seemed to work but the decimation was wrong. source.mkv frames: 1018 1019 1020 1021 1022 1023 1024 1025 1026 1027 1028 1029 target.mkv frames are: 1018 1020 1021 1022 1023 1024 1025 1027 1028 1029 target.mkv frames should be: 1018 1020 1021 1022 1023 1025 1026 1027 1028 'source.mkv frames 1018-1029.png' shows a source fragment that was made prior to frame number insertion but including my structure notation. 4 times, RUNME.CMD threw an "ffmpeg.exe - Application Error" exception: "The instruction at 0x7FF6086540D9 referenced memory at 0x. The memory could not be read." or "The instruction at 0x referenced memory at 0x. The memory could not be written." or "The exception unknown software exception (0x4015) occurred in the application at location 0x7FF6083E2424."
Re: [FFmpeg-user] ppsrc=1 decimation doesn't work or isn't correct or throws exception
I'm trying to file a bug report but there appears to be no way to get the report to you. I have some very short video samples and the ffmpeg report files. I spent 4 hours on this to make it so clear it will be very easy to replicate. But trac.ffmpeg.org won't accept it. ...Frustrated. Four hours. What a waste of my time. On 12/21/2020 02:52 PM, Mark Filipak (ffmpeg) wrote: Win10-1803. The sources & fault documentation files are here: To be determined -- trac.ffmpeg.org rejects it, too big (2,847,812 bytes) -- got any suggestions? RUNME.CMD : ffmpeg -i preprocessed.mkv -i source.mkv -filter_complex "decimate='ppsrc=1'" target.mkv -y I ran RUNME.CMD 40 times. 33 times, RUNME.CMD didn't do anything: target.mkv = 0 bytes. 3 times, RUNME.CMD seemed to work but the decimation was wrong. source.mkv frames: 1018 1019 1020 1021 1022 1023 1024 1025 1026 1027 1028 1029 target.mkv frames are: 1018 1020 1021 1022 1023 1024 1025 1027 1028 1029 target.mkv frames should be: 1018 1020 1021 1022 1023 1025 1026 1027 1028 'source.mkv frames 1018-1029.png' shows a source fragment that was made prior to frame number insertion but including my structure notation. 4 times, RUNME.CMD threw an "ffmpeg.exe - Application Error" exception: "The instruction at 0x7FF6086540D9 referenced memory at 0x. The memory could not be read." or "The instruction at 0x referenced memory at 0x. The memory could not be written." or "The exception unknown software exception (0x4015) occurred in the application at location 0x7FF6083E2424."
[FFmpeg-user] ppsrc=1 decimation doesn't work or isn't correct or throws exception
Win10-1803. The sources & fault documentation files are here: To be determined -- trac.ffmpeg.org rejects it, too big (2,847,812 bytes) -- got any suggestions? RUNME.CMD : ffmpeg -i preprocessed.mkv -i source.mkv -filter_complex "decimate='ppsrc=1'" target.mkv -y I ran RUNME.CMD 40 times. 33 times, RUNME.CMD didn't do anything: target.mkv = 0 bytes. 3 times, RUNME.CMD seemed to work but the decimation was wrong. source.mkv frames: 1018 1019 1020 1021 1022 1023 1024 1025 1026 1027 1028 1029 target.mkv frames are: 1018 1020 1021 1022 1023 1024 1025 1027 1028 1029 target.mkv frames should be: 1018 1020 1021 1022 1023 1025 1026 1027 1028 'source.mkv frames 1018-1029.png' shows a source fragment that was made prior to frame number insertion but including my structure notation. 4 times, RUNME.CMD threw an "ffmpeg.exe - Application Error" exception: "The instruction at 0x7FF6086540D9 referenced memory at 0x. The memory could not be read." or "The instruction at 0x referenced memory at 0x. The memory could not be written." or "The exception unknown software exception (0x4015) occurred in the application at location 0x7FF6083E2424."
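Since threading was suspected, here is a sketch of how the run can be forced single-threaded using standard ffmpeg options (whether this changes the outcome is exactly what was being tested; per the later posts, it did not):

ffmpeg -filter_threads 1 -filter_complex_threads 1 -i preprocessed.mkv -i source.mkv -filter_complex "decimate=ppsrc=1" -threads 1 target.mkv -y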
Re: [FFmpeg-user] frame number question
On 12/20/2020 01:49 PM, Paul B Mahol wrote: On Sun, Dec 20, 2020 at 12:03 AM Mark Filipak (ffmpeg) wrote: For the developers: At what point are frame numbers assigned? Libavfilter does not really care about frame numbers much. It internally just counts the number of frames that go into a filter and the number of frames that come out of it. I would guess: When they leave the decoder. Am I right? The decoder also does not care about frame numbers. Example: If '-vf decimate' removes frame[(n+1)%5==2], are frames renumbered at that point, or are the frames left as frame[0], frame[2], frame[3], frame[4], frame[5], frame[7], frame[8], frame[9], etc.? If you prefer I discover this on my own, that's okay, but I don't know how to put the frame number (as text) into the frames so that when I frame-step through the video I can see the frame numbers. If you could help with that, I'm pretty sure I can take the rest of the needed steps on my own. As mentioned above, each filter counts frame numbers from 0, and the frames that come out of a filter are also numbered from 0. There are no gaps. Ah! So frame number, i.e. 'n' or %{n}, is a dynamically assigned number (starting with '0') that's reassigned, frame-by-frame, by each element of a filter chain. Each frame that is decoded/filtered has no count number assigned to it at all. Understood: that frame number is not actually part of the video stream (TS or PS). I think that's what you meant. Thanks, Paul. The information you supplied would make a good addition to the documentation.
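For reference, the frame-number burn-in that makes such a test visible can be done with drawtext and the %{n} expansion; the font file is a placeholder (path escaping differs by platform, and builds with fontconfig can omit it):

ffmpeg -i clean.mkv -vf "drawtext=fontfile=arial.ttf:text='%{n}':x=10:y=10:fontcolor=white:box=1:boxcolor=black" numbered.mkv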
[FFmpeg-user] frame number question
For the developers: At what point are frame numbers assigned? I would guess: When they leave the decoder. Am I right? Example: If '-vf decimate' removes frame[(n+1)%5==2], are frames renumbered at that point, or are the frames left as frame[0], frame[2], frame[3], frame[4], frame[5], frame[7], frame[8], frame[9], etc.? If you prefer I discover this on my own, that's okay, but I don't know how to put the frame number (as text) into the frames so that when I frame-step through the video I can see the frame numbers. If you could help with that, I'm pretty sure I can take the rest of the needed steps on my own. Thanks Kindly, Mark.
Re: [FFmpeg-user] decimate=ppsrc question
On 12/17/2020 01:43 AM, Carl Eugen Hoyos wrote: On 17.12.2020 at 06:23, Mark Filipak (ffmpeg) wrote: ffmpeg -i SOURCE -vf decimate=ppsrc TARGET The decimate filter only makes sense in a filter chain after the fieldmatch filter. Carl Eugen The source video [1] is a 24p movie with a modulo-4 frame repeat (to 30p). The command line I used: ffmpeg -i SOURCE -map 0 -vf decimate -codec:v libx265 -x265-params crf=16:qcomp=1.00 -codec:a copy -codec:s copy -dn TARGET ffmpeg did a fine job, but there is still just a bit of jitter. The question is this: Can anyone kindly give me a usage example of this: ffmpeg -i SOURCE -vf decimate=ppsrc TARGET
[FFmpeg-user] decimate=ppsrc question
Can anyone kindly give me a usage example of this: ffmpeg -i SOURCE -vf decimate=ppsrc TARGET I'm totally dumbfounded by this: "ppsrc Mark main input as a pre-processed input and activate clean source input stream. This allows the input to be pre-processed with various filters to help the metrics calculation while keeping the frame selection lossless. When set to 1, the first stream is for the pre-processed input, and the second stream is the clean source from where the kept frames are chosen. Default is 0." Specifically: "the first stream is for the pre-processed input": What is a pre-processed input? How processed? "the second stream": What 2nd stream? Are there 2 inputs? If so, does that mean I need to use a filter complex? Or are there 2 passes? If so, how do I form the command line? "clean source": What is a clean source? How is it made clean (whatever that means)? Thanks for your help.
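To answer the command-line question concretely: yes, ppsrc=1 means two inputs, so it takes a filter complex (not two passes). A minimal sketch, with placeholder filenames -- the first input is a copy you have filtered however you like (cropped, denoised) so duplicates are easier to measure; the second is the untouched ("clean") source from which the kept frames are actually taken:

ffmpeg -i preprocessed.mkv -i clean.mkv -filter_complex "decimate=ppsrc=1" out.mkv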
Re: [FFmpeg-user] FFMPEG has trouble processing file while VLC plays mp2ts just fine
On 12/15/2020 04:06 AM, serge2school wrote: Hello, I have a major problem trying to read a video file using ffmpeg, while VLC has no trouble replaying it. The video file can be processed by ffmpeg, but it loses pts, skips multiple frames, and overall acts as if the video is broken. This is especially evident with silencedetect ... in VLC everything is pristine. Can someone please help? My experience in such cases is to run a test by transcoding video only and excluding audio ('-an'). If that works, then remux the transcoded video with the original audio and subtitles and you're done. I don't know why that works, but I suspect the audio has some time stamp errors that are leading ffmpeg astray. ffmpeg -i dd.m2ts -af silencedetect=noise=0.001 -f null - ffmpeg version 3.4.8-0ubuntu0.2 Copyright (c) 2000-2020 the FFmpeg developers built with gcc 7 (Ubuntu 7.5.0-3ubuntu1~18.04) configuration: --prefix=/usr --extra-version=0ubuntu0.2 --toolchain=hardened --libdir=/usr/lib/x86_64-linux-gnu --incdir=/usr/include/x86_64-linux-gnu --enable-gpl --disable-stripping --enable-avresample --enable-avisynth --enable-gnutls --enable-ladspa --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libgme --enable-libgsm --enable-libmp3lame --enable-libmysofa --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libpulse --enable-librubberband --enable-librsvg --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libssh --enable-libtheora --enable-libtwolame --enable-libvorbis --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx265 --enable-libxml2 --enable-libxvid --enable-libzmq --enable-libzvbi --enable-omx --enable-openal --enable-opengl --enable-sdl2 --enable-libdc1394 --enable-libdrm --enable-libiec61883 --enable-chromaprint --enable-frei0r --enable-libopencv --enable-libx264 --enable-shared libavutil 55. 78.100 / 55. 78.100 libavcodec 57.107.100 / 57.107.100 libavformat 57. 83.100 / 57. 83.100 libavdevice 57. 10.100 / 57. 10.100 libavfilter 6.107.100 / 6.107.100 libavresample 3. 7. 0 / 3. 7. 0 libswscale 4. 8.100 / 4. 8.100 libswresample 2. 9.100 / 2. 9.100 libpostproc 54. 7.100 / 54. 7.100 [mpeg2video @ 0x557cbc2dede0] Invalid frame dimensions 0x0. 
Last message repeated 2 times [mpegts @ 0x557cbc2da840] start time for stream 3 is not set in estimate_timings_from_pts [mpegts @ 0x557cbc2da840] PES packet size mismatch [mpegts @ 0x557cbc2da840] Could not find codec parameters for stream 4 (Unknown: none ([13][0][0][0] / 0x000D)): unknown codec Consider increasing the value for the 'analyzeduration' and 'probesize' options [mpegts @ 0x557cbc2da840] Could not find codec parameters for stream 5 (Unknown: none ([13][0][0][0] / 0x000D)): unknown codec Consider increasing the value for the 'analyzeduration' and 'probesize' options [mpegts @ 0x557cbc2da840] Could not find codec parameters for stream 6 (Unknown: none ([13][0][0][0] / 0x000D)): unknown codec Consider increasing the value for the 'analyzeduration' and 'probesize' options [mpegts @ 0x557cbc2da840] Could not find codec parameters for stream 7 (Unknown: none ([13][0][0][0] / 0x000D)): unknown codec Consider increasing the value for the 'analyzeduration' and 'probesize' options [mpegts @ 0x557cbc2da840] Could not find codec parameters for stream 8 (Unknown: none ([13][0][0][0] / 0x000D)): unknown codec Consider increasing the value for the 'analyzeduration' and 'probesize' options [mpegts @ 0x557cbc2da840] Could not find codec parameters for stream 9 (Unknown: none ([13][0][0][0] / 0x000D)): unknown codec Consider increasing the value for the 'analyzeduration' and 'probesize' options [mpegts @ 0x557cbc2da840] Could not find codec parameters for stream 10 (Unknown: none ([13][0][0][0] / 0x000D)): unknown codec Consider increasing the value for the 'analyzeduration' and 'probesize' options [mpegts @ 0x557cbc2da840] Could not find codec parameters for stream 11 (Unknown: none ([13][0][0][0] / 0x000D)): unknown codec Consider increasing the value for the 'analyzeduration' and 'probesize' options [mpegts @ 0x557cbc2da840] Could not find codec parameters for stream 12 (Unknown: none ([13][0][0][0] / 0x000D)): unknown codec Consider increasing the value for the 'analyzeduration' and 'probesize' options Input #0, mpegts, from 'dd.m2ts': Duration: 00:01:00.52, start: 12727.730011, bitrate: 16626 kb/s Program 1040 Metadata: service_name: F|%F%l?1 service_provider: Stream #0:0[0x111]: Video: mpeg2video (Main) ([2][0][0][0] / 0x0002), yuv420p(tv, bt709, top first), 1440x1080 [SAR 4:3 DAR 16:9], 29.97 fps, 29.97 tbr, 90k tbn, 59.94 tbc Stream #0:1[0x112]: Audio: aac (LC) ([15][0][0][0] / 0x000F), 48000 Hz, stereo, fltp, 246 kb/s Stream #0:2[0x116]: Data: bin_data ([6][0][0][0] / 0
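The test-then-remux suggestion above, as commands (the codec, CRF, and stream maps are assumptions about this particular file): if the first command runs clean, the audio's timestamps are the likely culprit, and the second command marries the re-encoded video back to the original audio:

ffmpeg -i dd.m2ts -an -sn -c:v libx264 -crf 18 video_only.mkv
ffmpeg -i video_only.mkv -i dd.m2ts -map 0:v -map 1:a -c copy remuxed.mkv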
[FFmpeg-user] ffmpeg x265 report v. '-x265-params' -- guide desired
I did a transcode from H.264 to H.265 via Handbrake, then tried ffmpeg. Results: 74,903,299 bytes made_via_handbrake.mkv ...beautiful 25,038,616 bytes made_via_ffmpeg.mkv ...a bit of plastic-face -- full output at end So, I need to tweak ffmpeg's x265 parameters, but how? How can I relate this output:

x265 [info]: HEVC encoder version 3.4+27-g5163c32d7
x265 [info]: build info [Windows][GCC 10.2.0][64 bit] 8bit+10bit+12bit
x265 [info]: using cpu capabilities: MMX2 SSE2Fast LZCNT SSSE3 SSE4.2 AVX FMA3 BMI2 AVX2
x265 [info]: Main profile, Level-4 (Main tier)
x265 [info]: Thread pool created using 8 threads
x265 [info]: Slices : 1
x265 [info]: frame threads / pool features : 3 / wpp(17 rows)
x265 [info]: Coding QT: max CU size, min CU size : 64 / 8
x265 [info]: Residual QT: max TU size, max depth : 32 / 1 inter / 1 intra
x265 [info]: ME / range / subpel / merge : hex / 57 / 2 / 3
x265 [info]: Keyframe min / max / scenecut / bias : 23 / 250 / 40 / 5.00
x265 [info]: Lookahead / bframes / badapt : 20 / 4 / 2
x265 [info]: b-pyramid / weightp / weightb : 1 / 1 / 0
x265 [info]: References / ref-limit cu / depth : 3 / off / on
x265 [info]: AQ: mode / str / qg-size / cu-tree : 2 / 1.0 / 32 / 1
x265 [info]: Rate Control / qCompress : CRF-28.0 / 0.60
x265 [info]: tools: rd=3 psy-rd=2.00 early-skip rskip mode=1 signhide tmvp
x265 [info]: tools: b-intra strong-intra-smoothing lslices=6 deblock sao

with the list of options here: https://ffmpeg.org/ffmpeg-codecs.html#Options-30? There is a clue (I think): "x265 --help"... I stumbled on this: https://ffmpeg.org/general.html#x265. On that page is a link to here: http://x265.org/developers.html. I took the link but was almost immediately assaulted by a pornographic game. Oh, dear. Does anyone have a guide to x265 they can share? ffmpeg -i b:\BDMV\STREAM\00204.m2ts -map 0 -codec:v libx265 -codec:a copy -codec:s copy made_via_ffmpeg.mkv ffmpeg version 2020-12-09-git-e5119a-full_build-www.gyan.dev Copyright (c) 2000-2020 the FFmpeg developers built with gcc 10.2.0 (Rev5, Built by MSYS2 project) configuration: --enable-gpl --enable-version3 --enable-static --disable-w32threads --disable-autodetect --enable-fontconfig --enable-iconv --enable-gnutls --enable-libxml2 --enable-gmp --enable-lzma --enable-libsnappy --enable-zlib --enable-libsrt --enable-libssh --enable-libzmq --enable-avisynth --enable-libbluray --enable-libcaca --enable-sdl2 --enable-libdav1d --enable-libzvbi --enable-librav1e --enable-libsvtav1 --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxvid --enable-libaom --enable-libopenjpeg --enable-libvpx --enable-libass --enable-frei0r --enable-libfreetype --enable-libfribidi --enable-libvidstab --enable-libvmaf --enable-libzimg --enable-amf --enable-cuda-llvm --enable-cuvid --enable-ffnvcodec --enable-nvdec --enable-nvenc --enable-d3d11va --enable-dxva2 --enable-libmfx --enable-libglslang --enable-vulkan --enable-opencl --enable-libcdio --enable-libgme --enable-libmodplug --enable-libopenmpt --enable-libopencore-amrwb --enable-libmp3lame --enable-libshine --enable-libtheora --enable-libtwolame --enable-libvo-amrwbenc --enable-libilbc --enable-libgsm --enable-libopencore-amrnb --enable-libopus --enable-libspeex --enable-libvorbis --enable-ladspa --enable-libbs2b --enable-libflite --enable-libmysofa --enable-librubberband --enable-libsoxr --enable-chromaprint libavutil 56. 62.100 / 56. 62.100 libavcodec 58.115.102 / 58.115.102 libavformat 58. 65.100 / 58. 65.100 libavdevice 58. 11.103 / 58. 11.103 libavfilter 7. 92.100 / 7. 92.100 libswscale 5. 
8.100 / 5. 8.100 libswresample 3. 8.100 / 3. 8.100 libpostproc55. 8.100 / 55. 8.100 [mpegts @ 026acb0fb4c0] start time for stream 2 is not set in estimate_timings_from_pts [mpegts @ 026acb0fb4c0] start time for stream 3 is not set in estimate_timings_from_pts [mpegts @ 026acb0fb4c0] start time for stream 4 is not set in estimate_timings_from_pts [mpegts @ 026acb0fb4c0] Could not find codec parameters for stream 2 (Subtitle: hdmv_pgs_subtitle (pgssub) ([144][0][0][0] / 0x0090)): unspecified size Consider increasing the value for the 'analyzeduration' (0) and 'probesize' (500) options [mpegts @ 026acb0fb4c0] Could not find codec parameters for stream 3 (Subtitle: hdmv_pgs_subtitle (pgssub) ([144][0][0][0] / 0x0090)): unspecified size Consider increasing the value for the 'analyzeduration' (0) and 'probesize' (500) options [mpegts @ 026acb0fb4c0] Could not find codec parameters for stream 4 (Subtitle: hdmv_pgs_subtitle (pgssub) ([144][0][0][0] / 0x0090)): unspecified size Consider increasing the value for the 'analyzeduration' (0) and 'probesize' (500) options Input #0, mpegts, from 'b:\BDMV\STREAM\00204.m2ts': Duration: 00:02:38.98, start: 11.650667, bitrate: 18758 kb/
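One general answer to the question above: every "x265 [info]" line echoes native x265 options -- the same names that "x265 --help" lists -- and any of them can be handed to libx265 through -x265-params as colon-separated key=value pairs. The values below are illustrative, not a recommendation:

ffmpeg -i in.m2ts -c:v libx265 -crf 18 -preset slow -x265-params "aq-mode=2:psy-rd=2.0:qcomp=0.70" out.mkv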
Re: [FFmpeg-user] DV to bob deinterlaced HEVC
On 11/12/2020 02:10 PM, Matti Haveri wrote: My current plan to convert 4:3 PAL .dv to HEVC as a batch: for i in *.dv; do ffmpeg -i "$i" -vf bwdif=1,scale=788:576,crop=768:576:10:0,setsar=sar=1/1 -c:v libx265 -crf 18 -preset slow -tag:v hvc1 -c:a aac -b:a 128k "${i%.*}_converted.mp4"; done In my tests bwdif has fewer artifacts than yadif. I guess it is better to deinterlace first, then scale? 4:3 PAL .dv 720x576 (PAR 128/117) to square pixels: a) scale to 788x576 then crop to 768x576 or: b) crop to 702x576 then scale to 768x576 http://web.archive.org/web/20140218044518/http://lipas.uwasa.fi/~f76998/video/conversion/ I chose to scale, then crop, because the other order 'crop=702:576:9:0,scale=768:576' produces the following alert (maybe this is just a cosmetic issue?): "[swscaler @ 0x1123e2000] Warning: data is not aligned! This can lead to a speed loss" I guess it doesn't matter in which order the libx265 options -crf, -preset etc. are given? CRF 18-20(-22) seems like a sweet spot. CRF 24 or higher produces very ugly artifacts in noisy parts of the footage. There seems to be little or no difference in quality between -preset slow vs medium. The 25fps vs 50fps difference is there but not as great as I expected. - Matti Do some research, Matti. If the PAL DVD is a movie, note its running time and compare to an NTSC DVD or Blu-ray running time. If the PAL DVD running time is 4% fast, then the video is actually p24. For movies, that's almost always the case. If so, try forcing 24FPS and let me know how it goes. If the video is okay (and it probably will be), then you will need to extract the audio, subs, and chapters separately, stretch them by 25/24, and merge them with the video -- you could try mkvmerge (or MKVToolNix GUI). Again, let me know how it goes. Regards, Mark.
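The "force 24FPS and stretch by 25/24" idea, sketched with ffmpeg alone. Assumptions: a 4%-fast PAL transfer with 48 kHz audio, and that the transfer raised pitch as well as speed (the asetrate/aresample pair undoes both); mkvmerge's --default-duration route would avoid re-encoding the video entirely:

ffmpeg -i pal.dv -vf "setpts=PTS*25/24" -r 24 -af "asetrate=48000*24/25,aresample=48000" -c:v libx265 -crf 18 out.mkv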
Re: [FFmpeg-user] sudo: ffmpeg: command not found
On 10/29/2020 02:09 PM, amindfv--- via ffmpeg-user wrote: On Thu, Oct 29, 2020 at 10:38:58AM +0100, Reindl Harald wrote: jesus christ how dumb are people? Nobody is born knowing how to install ffmpeg; it seems like more of a lack of knowledge than intelligence, and others on the mailing list were happy to pass along that knowledge. It's pointless to 'reason' with Reindl. His rudeness is incurable. When I read his posts, I don't have to look at who wrote them. I know it's him.
Re: [FFmpeg-user] MPEG TS+PS parser
On 10/16/2020 10:14 PM, Carl Zwanzig wrote: On 10/15/2020 1:44 AM, Mark Filipak (ffmpeg) wrote: Got any suggestions how I should proceed? I hate to say this, but you're rather reinventing the wheel (and building it out of toothpicks). You're absolutely right, Carl. There are a fair number of mpeg-ts parsers already out there in a variety of languages ... Really? Do you mean VOBEdit? All that I've seen are formatted as pretty pictures, and all that I've seen have errors. I want a plain text format, but I can't paste it here because it's too wide for email. ... and the kind of structure you've used is going to be slow and resource-intensive. If nothing else, if you're using the bash shell, use a case statement to separate out the translations (instead of passing everything through one HUGE pipeline). (This is not a task for pattern substitution.) Oh, I know that. I just wanted to get started. Looking at the output (such as it is), I could see cases where, for example, bogus start_codes (\x00 00 01) are inside private streams (and elsewhere). I used SNOBOL about 35 years ago to quickly write very powerful parsers. I'm looking at Unicon now.
[FFmpeg-user] search a DVD via the following regex - small correction
Sorry, I left out a '|', now corrected. I need a way to search a DVD via the following regex, then display the hex of the captured bytes; either a Windows or a Linux method will do. /\x00\x00\x01\xB5(\x11.|\x12.|\x13.|\x14[^\x82]|\x15.|\x16.|\x17.|\x18.|\x19.|\x1A.|\x1B.|\x1C.|\x1D.|\x1E.|\x1F.)/ The DVD barcode and the returned captured bytes would be very valuable to me (and perhaps to many users). A script that I could run on my own would really float my boat. Thanks so much, Mark. Of course you want to know why, eh? I have yet to see a DVD that has this: 0x00 00 01 B5 1? ??, where '? ??' is other than '4 82' (i.e. MP@ML plus !progressive_sequence plus 4:2:0). The above regex performs such a match. It's the 'sequence_extension' header ID metadata followed by 'profile_and_level_indication' -- the allowed combinations are shown in the table below -- followed by 'progressive_sequence' followed by 'chroma_format'.

0x00 00 01 B5 11 2  High@HighP
0x00 00 01 B5 11 4  High@High
0x00 00 01 B5 11 6  High@High1440
0x00 00 01 B5 11 8  High@Main
0x00 00 01 B5 11 A  High@Low
0x00 00 01 B5 12 2  SpatiallyScalable@HighP
0x00 00 01 B5 12 4  SpatiallyScalable@High
0x00 00 01 B5 12 6  SpatiallyScalable@High1440
0x00 00 01 B5 12 8  SpatiallyScalable@Main
0x00 00 01 B5 12 A  SpatiallyScalable@Low
0x00 00 01 B5 13 2  SNRScalable@HighP
0x00 00 01 B5 13 4  SNRScalable@High
0x00 00 01 B5 13 6  SNRScalable@High1440
0x00 00 01 B5 13 8  SNRScalable@Main
0x00 00 01 B5 13 A  SNRScalable@Low
0x00 00 01 B5 14 2  Main@HighP
0x00 00 01 B5 14 4  Main@High
0x00 00 01 B5 14 6  Main@High1440
0x00 00 01 B5 14 8  Main@Main  <<== all DVDs?
0x00 00 01 B5 14 A  Main@Low
0x00 00 01 B5 15 2  Simple@HighP
0x00 00 01 B5 15 4  Simple@High
0x00 00 01 B5 15 6  Simple@High1440
0x00 00 01 B5 15 8  Simple@Main
0x00 00 01 B5 15 A  Simple@Low
0x00 00 01 B5 18 E  Multi-view@Low
0x00 00 01 B5 18 D  Multi-view@Main
0x00 00 01 B5 18 B  Multi-view@High1440
0x00 00 01 B5 18 A  Multi-view@High
0x00 00 01 B5 18 5  4:2:2@Main
0x00 00 01 B5 18 2  4:2:2@High
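One way to run that search from a shell, assuming GNU grep built with PCRE support; the VOB path is a placeholder, -a treats the binary as text, -o/-b print each match with its byte offset, and the xxd pipe just renders the captured bytes readably as hex:

LC_ALL=C grep -obUaP '\x00\x00\x01\xB5(\x11.|\x12.|\x13.|\x14[^\x82]|\x15.|\x16.|\x17.|\x18.|\x19.|\x1A.|\x1B.|\x1C.|\x1D.|\x1E.|\x1F.)' VIDEO_TS/VTS_01_1.VOB | xxd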
[FFmpeg-user] search a DVD via the following regex
I need a way to search a DVD via the following regex, then display the hex of the captured bytes; either a Windows or a Linux method will do. /\x00\x00\x01\xB5(\x11.|\x12.|\x13.|\x14[^\x82]|\x15.|\x16.|\x17.|\x18.|\x19.|\x1A.|\x1B.|\x1C.|\x1D.|\x1E.\x1F.)/ The DVD barcode and the returned captured bytes would be very valuable to me (and perhaps to many users). A script that I could run on my own would really float my boat. Thanks so much, Mark. Of course you want to know why, eh? I have yet to see a DVD that has this: 0x00 00 01 B5 1? ??, where '? ??' is other than '4 82' (i.e. MP@ML plus !progressive_sequence plus 4:2:0). The above regex performs such a match. It's the 'sequence_extension' header ID metadata followed by 'profile_and_level_indication' -- the allowed combinations are shown in the table below -- followed by 'progressive_sequence' followed by 'chroma_format'.

0x00 00 01 B5 11 2  High@HighP
0x00 00 01 B5 11 4  High@High
0x00 00 01 B5 11 6  High@High1440
0x00 00 01 B5 11 8  High@Main
0x00 00 01 B5 11 A  High@Low
0x00 00 01 B5 12 2  SpatiallyScalable@HighP
0x00 00 01 B5 12 4  SpatiallyScalable@High
0x00 00 01 B5 12 6  SpatiallyScalable@High1440
0x00 00 01 B5 12 8  SpatiallyScalable@Main
0x00 00 01 B5 12 A  SpatiallyScalable@Low
0x00 00 01 B5 13 2  SNRScalable@HighP
0x00 00 01 B5 13 4  SNRScalable@High
0x00 00 01 B5 13 6  SNRScalable@High1440
0x00 00 01 B5 13 8  SNRScalable@Main
0x00 00 01 B5 13 A  SNRScalable@Low
0x00 00 01 B5 14 2  Main@HighP
0x00 00 01 B5 14 4  Main@High
0x00 00 01 B5 14 6  Main@High1440
0x00 00 01 B5 14 8  Main@Main  <<== all DVDs?
0x00 00 01 B5 14 A  Main@Low
0x00 00 01 B5 15 2  Simple@HighP
0x00 00 01 B5 15 4  Simple@High
0x00 00 01 B5 15 6  Simple@High1440
0x00 00 01 B5 15 8  Simple@Main
0x00 00 01 B5 15 A  Simple@Low
0x00 00 01 B5 18 E  Multi-view@Low
0x00 00 01 B5 18 D  Multi-view@Main
0x00 00 01 B5 18 B  Multi-view@High1440
0x00 00 01 B5 18 A  Multi-view@High
0x00 00 01 B5 18 5  4:2:2@Main
0x00 00 01 B5 18 2  4:2:2@High
Re: [FFmpeg-user] Glossary: Nyquist
Correction: "if your sampling frequency exceeds Nyquist" was "if your sampling frequency exceeds Nyquist/2" (which was an inadvertent mistake). Sorry. On 10/04/2020 09:44 PM, Anatoly wrote: On Sun, 4 Oct 2020 10:37:41 -0400 -snip- Have you watched "Part 2"? https://www.youtube.com/watch?v=ht4Mv2wIRyQ I don't think spatial image resolution is related to frequency at all. Then watch the video mentioned above at 8:07. He didn't explain this, but those plots are actually so-called frequency responses (attenuation vs frequency). What do you see on the upper plot there? What the upper graph shows is the frequency response of the serial bitstream from the photosite array through the analog-to-digital converter. Nyquist does indeed apply to that, but only because the camera forms the samples in a limited amount of time and the analog-to-digital converter is multiplexed (shared) by the photosites -- Fo is the sampling frequency, so the resolution of the analog-to-digital converter is artificially limited (filtered) to Fo/2. But that is not directly a property of the picture's resolution. It's a property of the camera design that limits the camera's resolution. Pictures are not made of sine waves. Coded pictures are DCT encoded, but Nyquist has nothing to do with that part because DCT is done entirely in the digital domain, after sampling, and time is not a factor in the digital domain. The sampling system limits resolution, yes. And that limit is Fo/2, yes. But pictures don't have a Nyquist frequency, no matter what number of samples/line or lines/frame. And pictures are not made of sine waves. You are getting lost in the camera design. The X axis is "spatial frequency / line pairs" -- the distance between two pairs of lines in a picture of alternating black and white lines. (line pairs)/(picture height) -- I assume "line pairs" means the horizontal distance between vertical lines -- is a pretty odd frequency scale, and calling that a spatial frequency is pretty bogus. The bottom line is this: Nyquist applies to the serial analog-to-digital conversion frequency, not to the ultimate resolution. It says that if your sampling frequency exceeds Nyquist, you're going to get aliasing. That's the kernel of the presentation, and you are misinterpreting what is being presented. In other words, the Nyquist frequency is a function of each particular camera and the frequency at which that camera does analog-to-digital conversion. The premise that to get 720x480, frames should be sampled at 1440x960 is bogus. I've read that so many times that I put it into the glossary without really thinking about it. But it's wrong. Oh, and did I say that pictures are not made of sine waves?
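For readers keeping score, the textbook statement at issue, in the usual notation: a signal band-limited to $f_{\max}$ is exactly recoverable from uniform samples at rate $f_s$ provided $f_s > 2 f_{\max}$; the Nyquist frequency is $f_N = f_s/2$; and a component at $f > f_N$ aliases to $|f - k f_s|$ for the nearest integer $k$. That is, aliasing is triggered by signal content exceeding $f_s/2$, not by the sampling frequency itself exceeding anything.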
Re: [FFmpeg-user] Glossary: Nyquist
On 10/04/2020 09:44 PM, Anatoly wrote: On Sun, 4 Oct 2020 10:37:41 -0400 -snip- Have you watched "Part 2"? https://www.youtube.com/watch?v=ht4Mv2wIRyQ I don't think spatial image resolution is related to frequency at all. Then watch the video mentioned above at 8:07. He didn't explain this, but those plots are actually so-called frequency responses (attenuation vs frequency). What do you see on the upper plot there? What the upper graph shows is the frequency response of the serial bitstream from the photosite array through the analog-to-digital converter. Nyquist does indeed apply to that, but only because the camera forms the samples in a limited amount of time and the analog-to-digital converter is multiplexed (shared) by the photosites -- Fo is the sampling frequency, so the resolution of the analog-to-digital converter is artificially limited (filtered) to Fo/2. But that is not directly a property of the picture's resolution. It's a property of the camera design that limits the camera's resolution. Pictures are not made of sine waves. Coded pictures are DCT encoded, but Nyquist has nothing to do with that part because DCT is done entirely in the digital domain, after sampling, and time is not a factor in the digital domain. The sampling system limits resolution, yes. And that limit is Fo/2, yes. But pictures don't have a Nyquist frequency, no matter what number of samples/line or lines/frame. And pictures are not made of sine waves. You are getting lost in the camera design. The X axis is "spatial frequency / line pairs" -- the distance between two pairs of lines in a picture of alternating black and white lines. (line pairs)/(picture height) -- I assume "line pairs" means the horizontal distance between vertical lines -- is a pretty odd frequency scale, and calling that a spatial frequency is pretty bogus. The bottom line is this: Nyquist applies to the serial analog-to-digital conversion frequency, not to the ultimate resolution. It says that if your sampling frequency exceeds Nyquist/2, you're going to get aliasing. That's the kernel of the presentation, and you are misinterpreting what is being presented. In other words, the Nyquist frequency is a function of each particular camera and the frequency at which that camera does analog-to-digital conversion. The premise that to get 720x480, frames should be sampled at 1440x960 is bogus. I've read that so many times that I put it into the glossary without really thinking about it. But it's wrong. Oh, and did I say that pictures are not made of sine waves?
Re: [FFmpeg-user] Glossary: Nyquist
On 10/04/2020 03:52 AM, Rodney Baker wrote: On Sunday, 4 October 2020 7:13:20 ACDT Mark Filipak (ffmpeg) wrote: On 10/03/2020 02:05 PM, Anatoly wrote: On Sat, 3 Oct 2020 11:05:03 -0400 -snip- You should learn then what a spectrum is. Oh, please. Be easy with me. I'm just a simple electrical engineer. And how any complex waveform (with its "information density") may be represented as a sum of many simple sine waves. Ah, now that would be a Taylor series, no? It's been about 4-1/2 decades, but I think it's a Taylor series. [...] Fourier, not Taylor. All images are made up of sine waves. You're kidding, eh? I just posted this: Demystifying Digital Camera Specifications (was Glossary: Nyquist) You should read it.
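For the record, the decomposition Anatoly invoked is Fourier's, not Taylor's: a suitably well-behaved periodic waveform of period $T$ can be written as $f(t) = \frac{a_0}{2} + \sum_{n=1}^{\infty}\left[a_n \cos\left(\frac{2\pi n t}{T}\right) + b_n \sin\left(\frac{2\pi n t}{T}\right)\right]$, i.e. a sum of simple sine waves, whereas a Taylor series expands a function in powers of $(t - t_0)$ about a single point and is not a sum of sinusoids.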
[FFmpeg-user] Demystifying Digital Camera Specifications (was Glossary: Nyquist)
Demystifying Digital Camera Specifications Here: https://www.provideocoalition.com/panavision_posts_demystifying_digital_camera_specifications_videos/ is a blog by Jim Feeley that mentions a video presentation by John Galt of Panavision and Larry Thorpe of Canon titled "Demystifying Digital Camera Specifications". The link that Jim Feeley provides is dead. I've searched for a current link so that I could share it here, but I couldn't find the video. I have it, either complete or as 7 parts.

20-08-10 19:04  397,590,171  Demystifying Digital Camera Specifications .mkv
20-08-10 18:51   76,493,280  Demystifying Digital Camera Specifications, Pt 1 .mkv
20-08-10 18:54   43,846,576  Demystifying Digital Camera Specifications, Pt 2 .mkv
20-08-10 18:56   46,831,671  Demystifying Digital Camera Specifications, Pt 3 .mkv
20-08-10 18:57  106,638,385  Demystifying Digital Camera Specifications, Pt 4 .mkv
20-08-10 18:57   45,627,889  Demystifying Digital Camera Specifications, Pt 5 .mkv
20-08-10 18:58   28,933,720  Demystifying Digital Camera Specifications, Pt 6 .mkv
20-08-10 18:58   49,218,650  Demystifying Digital Camera Specifications, Pt 7 .mkv

Anyone who wants one or all should contact me. -OR- Part 1 is on youtube: https://www.youtube.com/watch?v=gqq8QKMmtYg I assume the other parts are there, too. On 10/04/2020 05:00 AM, Anatoly wrote: On Sat, 3 Oct 2020 21:22:38 -0400 "Mark Filipak (ffmpeg)" wrote: Here's what I visualize: Imagine a heat map -- one of those colorful images ...reds and yellows and greens and blues. Then, imagine a screen in front of it, between you and the heat map. The screen is the final samples (ex: 720x480). Now write down the temperature for every cell of your screen and draw an X-Y plot: temperature vs cell number along a grid line. That's sampling. I think you'll agree that neither the screen nor the underlying heat map are serial in nature. Oh, they're transported as a sort-of raster That doesn't really matter. Now you have an X-Y plot of some function and you can process it mathematically as you wish. I totally agree. -- that's for sure -- but that's not how they're made and I don't think that Fourier applies. Then you must not think that you can JPEG-compress your screen image, because all JPEG/MPEG-like things work that way. Compression and resolution aren't related. Compression that is lossy spoils resolution, but that doesn't mean that there's a functional relationship between them. You know, I'm going to remove the reference to Nyquist. I don't think spatial image resolution is related to frequency at all. I don't think that the photons that fall on one pixel affect the photons that fall on nearby pixels in any way -- I'm discounting quantum mechanics for the pixel-to-pixel distances involved. That 'said', if the resulting image is scanned, rastered into lines of pixel values and sent as a serial analog signal, then Nyquist definitely applies. But that's not what's happening in a CCD or in the human eye. There is an exception. If a CCD's photosensors are 'read' and analog-to-digital converted serially (one photosensor at a time) -- and that is probably the case -- then Nyquist most definitely applies, but it applies to the analog-to-digital conversion, not to some fictional analog 'frequency' within the image. I think people have taken a temporal channel concept (Nyquist) and have tried to stretch it to fit a spatial situation. I don't buy it. Photons that are broadside-loaded into a camera (or into an eye) are not a serial stream (i.e. not a raster). Nyquist doesn't apply. Thank you. 
Do you think I should just post the whole thing? I can't. Not here, but maybe on github? Not github. I'm a human being. Github is for Martians. -- What if you woke up and found yourself in a police state? African-Americans wake up in a police state every day. ___ ffmpeg-user mailing list ffmpeg-user@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-user To unsubscribe, visit link above, or email ffmpeg-user-requ...@ffmpeg.org with subject "unsubscribe".
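Whichever way one reads the serial-readout point above, the aliasing that Nyquist warns about is easy to put numbers on. Here is a minimal sketch in Python with made-up frequencies (my illustration, not from the thread): a 9 Hz cosine read out at 10 samples per second -- below its 18-samples-per-second Nyquist rate -- yields exactly the same sample values as a 1 Hz cosine, so after conversion the two are indistinguishable.

    # Toy aliasing demo: a 9 Hz tone sampled at 10 samples/s folds to 1 Hz.
    import math

    f_signal = 9.0    # Hz; its Nyquist rate would be 18 samples/s
    f_sample = 10.0   # samples per second -- undersampled
    f_alias = abs(f_signal - round(f_signal / f_sample) * f_sample)  # 1.0 Hz

    for n in range(5):
        t = n / f_sample
        print(f"t={t:.1f}s  9 Hz sample={math.cos(2*math.pi*f_signal*t):+.3f}"
              f"  {f_alias:.0f} Hz alias={math.cos(2*math.pi*f_alias*t):+.3f}")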
Re: [FFmpeg-user] Glossary: Nyquist
On 10/03/2020 05:12 PM, Michael Koch wrote: Am 03.10.2020 um 22:43 schrieb Mark Filipak (ffmpeg): On 10/03/2020 02:05 PM, Anatoly wrote: On Sat, 3 Oct 2020 11:05:03 -0400 -snip- You should learn then what spectrum is. Oh, please. Be easy with me. I'm just a simple electrical engineer. And how any complex waveform (with its "information density") may be represented as a sum of many simple sinewaves. Ah, now that would be a Taylor series, no? Joseph Fourier just turned around in his grave... Michael Well, I've done some reading and it appears I need to tuck my tail between my legs and slink away. From what I've read, Nyquist sampling is in the frequency domain, but I honestly don't see how Fourier applies to spacial resolution, even given that MTF is used to characterize CCD and CMOS response. Bear with me now... Here's what I visualize: Imagine a heat map -- one of those colorful images ...reds and yellows and greens and blues. Then, imagine a screen in front of it, between you and the heat map. The screen is the final samples (ex: 720x480). The underlying heat map could also be 720x480, but if it is of higher resolution (ex: 1440x960), then the slopes of the color changes in the heat map are more precise and the positions of the true colors that show through the screen (manifested as more accurate values) are slightly more accurate. It's equivalent to rescreening a screened print image (ex: 200 dpi rescreened to 100 dpi). I think you'll agree that neither the screen nor the underlying heat map are serial in nature. Oh, they're transported as a sort-of raster -- that's for sure -- but that's not how they're made and I don't think that Fourier applies. The reason I cite a heat map rather than an ordinary image is that it makes it easier to visualize the image formed as an energy distribution; an energy map if you will -- heat is energy. The physical response to energy is not instantaneous. Responding to energy change requires time, and looking up & down or left & right, for example, exposes your eyes to changing energy. So it's a 2-step process: 1, the response of the sampling process to the underlying heat map, and 2, the response of your eyes to the content of the samples (i.e. the screen). That's what I meant by double gaussian. Why gaussian? Every real case of energy transfer that I've seen (outside of quantum mechanics) is ruled and regulated by gaussian factors, from fluid flow to capacitor charge. Is the exponential function gaussian? Gosh, I think so, but I can't find any justification right now. I've consulted my copy of Feynman but he didn't have anything to say about Nyquist. -- What if you woke up and found yourself in a police state? African-Americans wake up in a police state every day. ___ ffmpeg-user mailing list ffmpeg-user@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-user To unsubscribe, visit link above, or email ffmpeg-user-requ...@ffmpeg.org with subject "unsubscribe".
Re: [FFmpeg-user] Glossary: d-NTSC & d-PAL
On 10/03/2020 08:09 PM, Jim DeLaHunt wrote: On 2020-10-03 08:44, Mark Filipak (ffmpeg) wrote: -snip- When you say, "My goal (for a long time) is to differentiate hard telecine from pseudo NTSC (which I'm calling d-NTSC).… [using] MPEG-PS metadata", it sounds to me that your goal is to describe different content structures in the context of an MPEG-PS stream. That is exactly what I'm doing. I've been (manually) parsing a lot of video sequences (meaning the stream beginning with 'sequence_header_code': 0x00 00 01 B3, and ending with 'sequence_end_code': 0x00 00 01 B7 -- I'm not interested in the transport packets, though I've parsed them, too) looking for clues to formats. I spent over a month just figuring out macroblock structures. The right document for doing this work is a guide to or explanation of MPEG-PS stream contents. As part of describing a content structure, it is probably quite helpful to list the metadata values which identify that structure. But this document is not a glossary. Why not? (That question is rhetorical ... I appreciate that you have a right to your own opinion.) It also sounds to me like you are coining the term "d-NTSC" to name one kind of content structure. It is perfectly in scope to define names in such a guide or explanation. But it sounds like you aren't claiming that the term "d-NTSC" is [also] defined by some other document, such as the H.262 specification. Fine. H.262 (and presumably MPEG) don't name things. For example, H.262 refers to d-NTSC & d-PAL (i.e. scan-frames) by citing metadata thusly: "If progressive_frame is set to 0 it indicates that the two fields of the frame are interlaced fields in which an interval of time of the field period [1] exists between (corresponding spatial samples) of the two fields." -- how cumbersome! I'm just assigning names to the 30/1.001 Hz & 25 Hz versions. In the glossary, I would expect to see a term, e.g. "d-NTSC", and then one or more entries describing meanings of that term, each with an explanation of the term and a cross-reference to where the term is defined or used in an important way, e.g. to "Mark's Guide to MPEG-PS Stream Content Structures", section X.Y, "pseudo-NTSC content". Or simply put, what you are drafting in this thread is an entry in Mark's Guide to MPEG-PS, not a glossary entry. In my humble opinion. So, I take it that, to you, a glossary is a snack whereas a meal must be some sort of treatise and that you think a meal is required. I disagree, but maybe you're right. Perhaps a presentation of my motives is in order? -- I DO have an axe to grind. :-) Treatises drive me nuts. I better understand a complicated subject by hacking between and among concise definitions. I rarely read treatises because they always seem to explain by citing use cases. With each use case, the architecture comes into better focus, but it does take relearning over and over and that takes so much time. I'm a computer system architect. Kindly just give me the structure and the procedure and I'll put it together. I don't need use cases. (Code would probably be sufficient, but I don't know 'C'.) When presented with a treatise, what I do is scan it -- I never exhaustively read it -- and build a glossary. Then, to really understand the topic, I scan the glossary, pulling threads together in my mind until I've formed an architecture. Then I test the architecture against the treatise's use cases. I don't think I'm alone in this. In the case of ffmpeg, everything seems to be use cases and it drives me postal.
-- What if you woke up and found yourself in a police state? African-Americans wake up in a police state every day. ___ ffmpeg-user mailing list ffmpeg-user@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-user To unsubscribe, visit link above, or email ffmpeg-user-requ...@ffmpeg.org with subject "unsubscribe".
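For anyone who wants to do the same kind of manual spelunking, a rough sketch of a start-code scanner follows (my own illustration, not Mark's actual tooling). The start-code values are the real H.262 ones; everything else -- the function, the output format -- is hypothetical.

    # Scan a raw MPEG video elementary stream for H.262 start codes.
    # Start codes are the byte pattern 00 00 01 xx.
    import sys

    NAMES = {0x00: "picture_start_code",
             0xB3: "sequence_header_code",
             0xB5: "extension_start_code",
             0xB7: "sequence_end_code"}

    def scan(path):
        data = open(path, "rb").read()
        i = 0
        while True:
            i = data.find(b"\x00\x00\x01", i)
            if i < 0 or i + 3 >= len(data):
                break
            code = data[i + 3]
            if code in NAMES:
                print(f"offset {i:#010x}: {NAMES[code]} (0x{code:02X})")
            i += 4

    if __name__ == "__main__":
        scan(sys.argv[1])  # e.g. a demuxed .m2v elementary stream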
Re: [FFmpeg-user] Glossary: Nyquist
On 10/03/2020 05:12 PM, Michael Koch wrote: Am 03.10.2020 um 22:43 schrieb Mark Filipak (ffmpeg): On 10/03/2020 02:05 PM, Anatoly wrote: On Sat, 3 Oct 2020 11:05:03 -0400 -snip- You should learn then what spectrum is. Oh, please. Be easy with me. I'm just a simple electrical engineer. And how any complex waveform (with its "information density") may be represented as a sum of many simple sinewaves. Ah, now that would be a Taylor series, no? Joseph Fourier just turned around in his grave... Is Nyquist a consequence of Fourier or Taylor? The factor in spacial resolution is not sampling frequency or whether the sampling clock is a square wave or any particular wave. It's geometric. It's the geometry of the energy that reaches the eye and the eye's response (rods & cones). I think the issue is one of gaussian response, an issue of slope. The gaussian rules the world of analog to digital conversion just as it rules the world of particle interactions. I don't see Fourier as applying to that. -- What if you woke up and found yourself in a police state? African-Americans wake up in a police state every day. ___ ffmpeg-user mailing list ffmpeg-user@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-user To unsubscribe, visit link above, or email ffmpeg-user-requ...@ffmpeg.org with subject "unsubscribe".
Re: [FFmpeg-user] Glossary: Nyquist
On 10/03/2020 02:05 PM, Anatoly wrote: On Sat, 3 Oct 2020 11:05:03 -0400 -snip- You should learn then what spectrum is. Oh, please. Be easy with me. I'm just a simple electrical engineer. And how any complex waveform (with its "information density") may be represented as a sum of many simple sinewaves. Ah, now that would be a Taylor series, no? It's been about 4-1/2 decades but I think it's a Taylor series. Then you'll understand that all of that may be simplified to the picture I drew, and to what the definition of the Nyquist-Shannon theorem literally states (again): "If a function x(t) contains no frequencies higher than B hertz, it is completely determined by giving its ordinates at a series of points spaced 1/(2B) seconds apart. A sufficient sample-rate is therefore anything larger than 2B samples per second." Again, you bring up signals and sample rate. A video frame is not a signal. A camera or a film scanner has a sample rate, but that sample rate is not bandwidth limited -- i.e. there is no (realistic) limit to the analog image frequency, so 2B samples is meaningless. Even for a single, static picture, Nyquist still applies. Sorry, but I don't see that you disagree with my glossary entry: Nyquist sampling: The principle [1] that, to most faithfully reproduce an image at a given digital display's resolution, samples must be made at or above twice the display's resolution, both horizontally & vertically. [1] The Nyquist principle applies to film sampling and to digital cameras, but, provided that resolution is unchanged, not to transcoding (because the transcoder inputs are already digital). As proved by the improved sharpness of SD media made from 2K film samples, SD mastering prior to the advent of 2K sampling (e.g. DVDs mastered from film before the advent of HD) generally ignored the Nyquist principle, so those masters were undersampled. HDs sampled at 2K and UHDs sampled at 4K are likewise undersampled. Maybe it's fun to discuss such things, but I think here is not the right place to do it, because it has no direct relation to ffmpeg usage. If not ffmpeg.org, then where? doom9.org? -- no organization there, a glossary would get lost. Or Wikipedia? Ha! I really don't know. Maybe because of my personal approach, which is to create my own resources for my own projects, then just link to them. The audience is here. ...Perhaps Wikipedia some day. Then I may wish you to show a worthy draft of your project to the audience before the audience gets completely bored. Good luck! Thank you. Do you think I should just post the whole thing? I can't. -- What if you woke up and found yourself in a police state? African-Americans wake up in a police state every day. ___ ffmpeg-user mailing list ffmpeg-user@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-user To unsubscribe, visit link above, or email ffmpeg-user-requ...@ffmpeg.org with subject "unsubscribe".
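The quoted theorem statement can also be exercised numerically. Here is a sketch (mine, not part of the exchange) of the reconstruction half of the theorem: a 3 Hz tone sampled at 8 samples per second -- more than 2B = 6 -- is recovered between the sample instants by Whittaker-Shannon sinc interpolation. With a finite number of taps the recovery is approximate, but it agrees to about three decimals.

    # x_r(t) = sum_n x(nT) * sinc((t - nT)/T), truncated to +/-4000 taps.
    import math

    B, fs = 3.0, 8.0          # tone frequency and sample rate (fs > 2B)
    T = 1.0 / fs

    def x(t):
        return math.sin(2 * math.pi * B * t)

    def reconstruct(t, taps=4000):
        total = 0.0
        for n in range(-taps, taps + 1):
            u = (t - n * T) / T
            total += x(n * T) * (1.0 if u == 0 else math.sin(math.pi * u) / (math.pi * u))
        return total

    for t in (0.03, 0.11, 0.27):
        print(f"t={t}: true={x(t):+.3f}  reconstructed={reconstruct(t):+.3f}")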
Re: [FFmpeg-user] Glossary: d-NTSC & d-PAL
On 10/01/2020 10:42 PM, Jim DeLaHunt wrote: On 2020-10-01 15:37, Mark Filipak (ffmpeg) wrote: On 2020-10-01 15:30, Jim DeLaHunt wrote: It is an improvement that you are now stating the context, "MPEG-PS binary metadata values". You omitted that context before. But you continue to put #2 in the glossary entry, and I continue to be of the opinion that the glossary is the wrong place for the content. The details of the table formatting of #2 is a side issue in this discussion. What is #2? "#1" and "#2" is your notation, in a message I quoted in my reply. You elided that quote in your reply. Ah! Thank you, Jim. ...senior moment :-) I INTERRUPT THIS PROGRAM FOR AN IMPORTANT ANNOUNCEMENT. The text two paragraphs below was my original response. I include it just for the sake of completeness and for Jim's benefit, however, things have... progressed (in a way). My goal (for a long time) is to differentiate hard telecine from pseudo NTSC (which I'm calling d-NTSC). I thought I'd found the differentiation: The combined values of 'progressive_sequence' and 'progressive_frame' MPEG-PS metadata. I was wrong. The video that I thought was hard telecined was actually soft telecined. When I realized my error, I revised my response: "'progressive_sequence' = 0 & 'progressive_frame' = 1 means that the frame is soft telecined", and only then realized that I'd screwed the pooch: that I still had no way to differentiate hard telecine from d-NTSC. I'm withdrawing the d-NTSC & d-PAL entries and will rework them when I can, in fact, differentiate d-NTSC & d-PAL from hard telecine. (sigh!) WE NOW RESUME OUR REGULARLY SCHEDULED PROGRAM. If I understand your point, it is that the definition is followed by a table (of sorts) that you think is content and not a suitable part of the definition. Okay, let me try to explain and maybe you can suggest a better way, eh? The 'table' isn't really a table. It's the metadata values that are necessary for the d-NTSC frame to be a d-NTSC frame. I provide it so that readers can verify that, "Yes, indeed, I'm looking at a d-NTSC frame". The big clue is that 'progressive_sequence' (from the sequence_extension metadata structure) and 'progressive_frame' (from the picture_coding_extension metadata structure) are both zero. My friend, it took me months to figure that out because H.262 doesn't put things together. Here's the straight dope: 'progressive_sequence' = 1 means the frame is a picture. 'progressive_sequence' = 0 & 'progressive_frame' = 0 means that the frame is d-NTSC or d-PAL. 'progressive_sequence' = 0 & 'progressive_frame' = 1 means that the frame is soft telecined. Without that info, you can't tell d-NTSC from hard telecined unless you single step through the video frames and know what to look for. The other metadata in the 'table' is the rest of the stuff that distinguishes d-NTSC from d-PAL: width, height, aspect ratio, etc. So, you see, the metadata has to be part of the definition, or at least that's what I think. Do you have a better idea? -- What if you woke up and found yourself in a police state? African-Americans wake up in a police state every day. -- What if you woke up and found yourself in a police state? African-Americans wake up in a police state every day. ___ ffmpeg-user mailing list ffmpeg-user@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-user To unsubscribe, visit link above, or email ffmpeg-user-requ...@ffmpeg.org with subject "unsubscribe".
Re: [FFmpeg-user] Glossary: Nyquist
On 10/03/2020 06:41 AM, Anatoly wrote: On Fri, 2 Oct 2020 20:47:57 -0400 "Mark Filipak (ffmpeg)" wrote: -snip- By the way, I've given up trying to make an illustration of 2-dimensional Nyquist sampling. It's too hard. I think it's easy. Just scale down to one dimension to start from. Let's draw an X-Y plot of one line of our picture of alternating black-white stripes:
Voltage   ^
 -or-     |      b   w   b   w
Light     |       ___     ___
intensity |      /   \   /   \
          |_____/     \___/   \_
          |______________________> Time -or- position
                | | | |            samples
           ____     ____     ___
               \___/    \___/      sampling freq -or- distance
Here we are digitizing 4 pixels. Does not matter how they are separated one from another - temporally (analogue video signal) or spatially (lying on the CCD silicon surface). The Nyquist criterion says that to digitize (somehow) 4 pixels we need to take 4 samples. Note that our "signal" frequency (again, temporal or spatial) is 1/2 of the sampling frequency. That is it. Where's the twice-the-display-resolution in your diagram? My understanding of Nyquist is limited. I think that it's based on the information density present in a signal having amplitude S, that transitions from S to S+d(S) (not black to white) and that it therefore defines a minimal slope (hence, the connection to bandwidth). I, myself, question that bandwidth is an adequate metric and whether 'information' is adequately characterized, but science only 'sees' what it can measure, eh? I'll stick with a definition based on energy density (which, in the listening and the seeing, has a gaussian profile and is based on physics). Maybe it's fun to discuss such things, but I think here is not the right place to do it, because it has no direct relation to ffmpeg usage. If not ffmpeg.org, then where? doom9.org? -- no organization there, a glossary would get lost. Or Wikipedia? Ha! The audience is here. ...Perhaps Wikipedia some day. -- What if you woke up and found yourself in a police state? African-Americans wake up in a police state every day. ___ ffmpeg-user mailing list ffmpeg-user@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-user To unsubscribe, visit link above, or email ffmpeg-user-requ...@ffmpeg.org with subject "unsubscribe".
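Anatoly's stripe picture reduces to a few lines of arithmetic (my toy numbers, hypothetical pixel values): one scan line of alternating black/white stripes, sampled at the critical rate and then at half of it. At half the rate the stripes alias into a flat field, and which shade you get depends only on where the first sample happens to land.

    def stripes(x, period=2.0):
        # one white stripe then one black stripe per period
        return 255 if (x % period) < period / 2 else 0

    # 2 samples per stripe pair (the Nyquist rate): the pattern survives.
    print([stripes(0.5 + n * 1.0) for n in range(8)])  # [255, 0, 255, 0, ...]

    # 1 sample per stripe pair (half the Nyquist rate): a flat field whose
    # shade is set by the sampling phase.
    print([stripes(0.0 + n * 2.0) for n in range(8)])  # all 255 (white)
    print([stripes(1.0 + n * 2.0) for n in range(8)])  # all 0 (black)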
Re: [FFmpeg-user] Glossary: Nyquist
On 10/02/2020 02:15 PM, Eduardo Alarcón wrote: El vie., 2 oct. 2020 a las 7:34, Anatoly () escribió: On Thu, 1 Oct 2020 20:25:30 -0400 "Mark Filipak (ffmpeg)" wrote: When sampling an analog voltage, resolution is the ability to resolve voltage value within a certain period of time (i.e. within a given channel bandwidth). When sampling a visual field of view however, resolution is the ability to resolve stationary edges that vary spacially, going from light to dark or dark to light. It's the same gaussian energy transfer issue (i.e. that transferring energy requires time) with the same signal-to-noise issues and the same handy half-power shorthand, but it applies to ... wait for it ... human eyes! Human eyes resolve edges only so well, even totally black abutting totally white. There is nothing you can do about that, and staring at the edge doesn't bring it into higher resolution. However, if the image source itself has fuzzy edges because it was sampled at lower than Nyquist, then the result in our brains is a double gaussian, the first from the CCD and the second from our eyes. It's that double gaussian that is avoided by spacially sampling at higher than 2x the display resolution. I think this is wrong: the Nyquist theorem and its principles apply to the sampling of a signal; they have nothing to do with eyes or brain, ... Not correct. It's not biological, but it does apply to biology, specifically, to the eyes. Okay, 2 thought experiments: 1 - Imagine a film scanner sampling a film frame line by line. Isn't the scanner making a signal that the sampler uses to make samples? If you think that Nyquist applies only to signals, then, there's your signal. 2 - What about a CCD array that makes all the samples at one time? Doesn't that expand the signal to 2 dimensions? ... it describes the minimum sampling rate ... Nyquist has nothing to do with rate. If Wikipedia says otherwise, then Wikipedia is wrong. Rate only applies to broadcast media like television. Rate determines the bandwidth needed (which may be more than what's allowed for channels), but bandwidth is meaningless in a film scanner or a camera because they are not broadcast. What Wikipedia may be referring to is the bandwidth needed for digital TV. That really has nothing to do with Nyquist. But then, Wikipedia isn't written by experts, is it? I can see how it would mislead you. By the way, I've given up trying to make an illustration of 2-dimensional Nyquist sampling. It's too hard. -- What if you woke up and found yourself in a police state? African-Americans wake up in a police state every day. ___ ffmpeg-user mailing list ffmpeg-user@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-user To unsubscribe, visit link above, or email ffmpeg-user-requ...@ffmpeg.org with subject "unsubscribe".
Re: [FFmpeg-user] Glossary: Nyquist
On 10/02/2020 06:34 AM, Anatoly wrote: On Thu, 1 Oct 2020 20:25:30 -0400 "Mark Filipak (ffmpeg)" wrote: On 10/01/2020 07:43 PM, Anatoly wrote: On Wed, 30 Sep 2020 19:21:59 -0400 "Mark Filipak (ffmpeg)" wrote: Nyquist [adjective]: 1, Reference to the Nyquist-Shannon sampling theorem. 2, The principle [1] that, to most faithfully reproduce an image at a given digital display's resolution, the samples must be made at or above twice the display's resolution, both horizontally & vertically [2]. Sorry, but this is wrong. from https://en.wikipedia.org/wiki/Nyquist%E2%80%93Shannon_sampling_theorem "If a function x(t) contains no frequencies higher than B hertz, it is completely determined by giving its ordinates at a series of points spaced 1/(2B) seconds apart. A sufficient sample-rate is therefore anything larger than 2B samples per second." Let's say we have 640 horizontal dots (pixels) per line in the NTSC system. -snip- Yes, yes, of course. You are correct, but this is different. The source is not an NTSC analog signal. The source is analog streams of photons striking a CCD imager array, frame by frame, and applies to the image regardless of whether the image is moving or stationary, and regardless of exposure time (which affects brightness, not resolution). The source is a 2-dimensional, lighted field of view in a camera or film scanner transferring light energy to produce charge in photo transistors over a spacial area. It's not temporal as is the case when sampling a changing analog voltage. Yet I think replacing Voltage with Light Intensity and Time with X coordinate on an analogue video signal graph changes nothing, if you are talking about moving to the spatial domain. When sampling an analog voltage, resolution is the ability to resolve voltage value within a certain period of time (i.e. within a given channel bandwidth). When sampling a visual field of view however, resolution is the ability to resolve stationary edges that vary spacially, going from light to dark or dark to light. It's the same gaussian energy transfer issue (i.e. that transferring energy requires time) with the same signal-to-noise issues and the same handy half-power shorthand, but it applies to ... wait for it ... human eyes! Human eyes resolve edges only so well, even totally black abutting totally white. There is nothing you can do about that, and staring at the edge doesn't bring it into higher resolution. However, if the image source itself has fuzzy edges because it was sampled at lower than Nyquist, then the result in our brains is a double gaussian, the first from the CCD and the second from our eyes. It's that double gaussian that is avoided by spacially sampling at higher than 2x the display resolution. So you want to say that if I am watching a picture on a 640x480-dot display, my brain "effectively" can perceive only 320x240 dots. Hi Anatoly, The issue is not biology. The issue is pure physics. In your scenario, your eyes do see 640x480. Your brain does see 640x480. But in order to cleanly 'see' a black-white edge inside those 640x480 dots, the 640x480 dots need to be made from 1280x960 samples within the camera. If the camera made 640x480, then, yes, you would see that edge at 320x240 effective resolution (i.e. fuzzier). I'm trying to prepare some illustrations that will show how Nyquist works in images. That's proving to be really hard. Two-dimensional Nyquist is hard to visualize. In the meantime, and in answer to yours & Eduardo's posts, I'm going to write an explanation instead of showing an explanation.
The Nyquist criterion is based on physics not biology. In physics, perceiving/measuring physical properties is based on moving energy from object to observer plus the time that takes. If energy doesn't move, then there is no measurement or observable result. The interpretation of Nyquist at Wikipedia addresses 1-dimensional voltage (a signal). What is presented is limited to one dimension plus time. However, over a 2-dimensional area (such as a display), the more general interpretation of resolution is based on the rate of change of energy density per unit of time. Higher energy (e.g. more light) makes images more resolvable. Higher density (e.g. more dots per square mm) makes images more resolvable. More time (e.g. longer observation) makes images more resolvable. Anyone who thinks that Nyquist sampling is limited to signals is wrong. Nyquist sampling applies to 2-dimensional areas, too. (Nyquist applies to 3 dimensions, also, but that's another story.) Nyquist applies to anything/everything that's converted from analog to digital. And for my brain to perceive "effectively" 640x480, I need 1280x960 from CCD to LCD? Then I may say that in the image processing domain such terms as "640x480" or "4K" are all about the real count of pixels
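Mark's 1280x960-to-640x480 claim can be sketched in one dimension with toy numbers (mine, not his): a hard black/white edge that falls between coarse sample sites gets snapped to the wrong place by 1x point sampling, while 2x sampling followed by a box average renders it with a correctly placed intermediate value.

    W = 8
    hi_res = [0] * 7 + [255] * 9    # 16 fine sites; the edge falls between coarse sites

    point_sampled = hi_res[::2]     # naive 1x read-out: every other site
    box_filtered = [(hi_res[2*i] + hi_res[2*i + 1]) // 2 for i in range(W)]

    print(point_sampled)  # [0, 0, 0, 0, 255, 255, 255, 255] -- edge snapped half a pixel off
    print(box_filtered)   # [0, 0, 0, 127, 255, 255, 255, 255] -- edge placed by a half-tone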
Re: [FFmpeg-user] Glossary: Nyquist
On 10/01/2020 07:43 PM, Anatoly wrote: On Wed, 30 Sep 2020 19:21:59 -0400 "Mark Filipak (ffmpeg)" wrote: Nyquist [adjective]: 1, Reference to the Nyquist-Shannon sampling theorem. 2, The principle [1] that, to most faithfully reproduce an image at a given digital display's resolution, the samples must be made at or above twice the display's resolution, both horizontally & vertically [2]. Sorry, but this is wrong. from https://en.wikipedia.org/wiki/Nyquist%E2%80%93Shannon_sampling_theorem "If a function x(t) contains no frequencies higher than B hertz, it is completely determined by giving its ordinates at a series of points spaced 1/(2B) seconds apart. A sufficient sample-rate is therefore anything larger than 2B samples per second." Let's say we have 640 horizontal dots (pixels) per line in the NTSC system. -snip- Yes, yes, of course. You are correct, but this is different. The source is not an NTSC analog signal. The source is analog streams of photons striking a CCD imager array, frame by frame, and applies to the image regardless of whether the image is moving or stationary, and regardless of exposure time (which affects brightness, not resolution). The source is a 2-dimensional, lighted field of view in a camera or film scanner transferring light energy to produce charge in photo transistors over a spacial area. It's not temporal as is the case when sampling a changing analog voltage. When sampling an analog voltage, resolution is the ability to resolve voltage value within a certain period of time (i.e. within a given channel bandwidth). When sampling a visual field of view however, resolution is the ability to resolve stationary edges that vary spacially, going from light to dark or dark to light. It's the same gaussian energy transfer issue (i.e. that transferring energy requires time) with the same signal-to-noise issues and the same handy half-power shorthand, but it applies to ... wait for it ... human eyes! Human eyes resolve edges only so well, even totally black abutting totally white. There is nothing you can do about that, and staring at the edge doesn't bring it into higher resolution. However, if the image source itself has fuzzy edges because it was sampled at lower than Nyquist, then the result in our brains is a double gaussian, the first from the CCD and the second from our eyes. It's that double gaussian that is avoided by spacially sampling at higher than 2x the display resolution. -- What if you woke up and found yourself in a police state? African-Americans wake up in a police state every day. ___ ffmpeg-user mailing list ffmpeg-user@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-user To unsubscribe, visit link above, or email ffmpeg-user-requ...@ffmpeg.org with subject "unsubscribe".
Re: [FFmpeg-user] Glossary: d-NTSC & d-PAL
On 2020-10-01 14:49, Mark Filipak (ffmpeg) wrote: On 10/01/2020 05:13 PM, Mark Filipak (ffmpeg) wrote: On 10/01/2020 03:21 PM, Jim DeLaHunt wrote: OK, then I think what you have — what you put in your text attachment "d-NTSC & d-PAL .txt" in your previous message — is two single-line glossary entries, conjoined with entries from a table mapping H.262 Metadata Values to video types d-NTSC and d-PAL. -snip- I'll try again. How's this look, Jim? Clear? Or muddled?
d-NTSC [noun]: The digital equivalent of NTSC. d-NTSC is distinguished by a frame having all 8 MPEG-PS binary metadata values below.
'aspect_ratio_information' = 0010 [1]
'frame_rate_code' = 0100 [1]
[… snip …]
It is an improvement that you are now stating the context, "MPEG-PS binary metadata values". You omitted that context before. But you continue to put #2 in the glossary entry, and I continue to be of the opinion that the glossary is the wrong place for the content. The details of the table formatting of #2 is a side issue in this discussion. What is #2? -- What if you woke up and found yourself in a police state? African-Americans wake up in a police state every day. ___ ffmpeg-user mailing list ffmpeg-user@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-user To unsubscribe, visit link above, or email ffmpeg-user-requ...@ffmpeg.org with subject "unsubscribe".
Re: [FFmpeg-user] Glossary: d-NTSC & d-PAL
On 10/01/2020 05:13 PM, Mark Filipak (ffmpeg) wrote: On 10/01/2020 03:21 PM, Jim DeLaHunt wrote: OK, then I think what you have — what you put in your text attachment "d-NTSC & d-PAL .txt" in your previous message — is two single-line glossary entries, conjoined with entries from a table mapping H.262 Metadata Values to video types d-NTSC and d-PAL. -snip- I'll try again. How's this look, Jim? Clear? Or muddled?
d-NTSC [noun]: The digital equivalent of NTSC. d-NTSC is distinguished by a frame having all 8 MPEG-PS binary metadata values below.
'aspect_ratio_information' = 0010 [1]
'frame_rate_code' = 0100 [1]
'horizontal_size_value' = 0010 1101 [1]
'vertical_size_value' = 0001 1110 [1]
'horizontal_size_extension' = 00 [2]
'vertical_size_extension' = 00 [2]
'progressive_sequence' = 0 [2]
'progressive_frame' = 0 [3]
[1] From sequence_header
[2] From sequence_extension
[3] From picture_coding_extension
-- What if you woke up and found yourself in a police state? African-Americans wake up in a police state every day. ___ ffmpeg-user mailing list ffmpeg-user@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-user To unsubscribe, visit link above, or email ffmpeg-user-requ...@ffmpeg.org with subject "unsubscribe".
Re: [FFmpeg-user] Glossary: d-NTSC & d-PAL
On 10/01/2020 03:21 PM, Jim DeLaHunt wrote: -snip- OK, then I think what you have — what you put in your text attachment "d-NTSC & d-PAL .txt" in your previous message — is two single-line glossary entries, conjoined with entries from a table mapping H.262 Metadata Values to video types d-NTSC and d-PAL. Well, that deserves an answer. There are two parts to each glossary entry: 1, A simple statement of what a thing is, and 2, a "distinguished by" identification so that readers can identify the thing. The table is the "distinguished by" part. I formatted it as a table to make it easy, or at least so I thought! :-) For example, d-NTSC is distinguished via MPEG-PS binary metadata: 'progressive_sequence' == 0 & 'progressive_frame' == 0 & 'aspect_ratio_information' == 0010 & 'frame_rate_code' == 0100 & 'horizontal_size_value' == 0010 1101 & 'horizontal_size_extension' == 00 & 'vertical_size_value' == 0001 1110 & 'vertical_size_extension' == 00. That's the logic. Putting the table into a different glossary entry wouldn't work, but obviously, the way I'm doing it now is a FAIL. I'll try again. -- What if you woke up and found yourself in a police state? African-Americans wake up in a police state every day. ___ ffmpeg-user mailing list ffmpeg-user@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-user To unsubscribe, visit link above, or email ffmpeg-user-requ...@ffmpeg.org with subject "unsubscribe".
Re: [FFmpeg-user] Glossary: d-NTSC & d-PAL
On 10/01/2020 02:01 PM, Phil Rhodes via ffmpeg-user wrote: I don't know who's in charge of this glossary project, but can I please propose something on the difference between colourspace, subsampling and luminance encoding. And all the other things people constantly confuse. No one's in charge. I've taken the initiative to begin (and conduct and finish and whatever it takes) the effort [1]. Based on the evidence, I'm not alone, but no one's in charge. Regarding your concerns, Phil, I believe I have a lock on subsampling & luminance encoding (or, at least, the structures, but not the processes) and I would post glossary entries if I could, but they're very large and have large texie pix diagrams and are formatted to 148 columns of text and therefore don't fit into email format, and I would probably incur the ire of the list maintainers if I were to attach such large files. Sorry, Regarding colorspace, such documentation exists, at least in name -- literally, names only to the best of my knowledge -- but the details of the data structures are, at best: buried in code algorithms without explicit structural documentation, or, at worst: completely lacking and subject to trial-and-error hacking, even by the developers. [1] There are 3 ways to take leadership: #1, be appointed by some authority, or #2, be elected by some constituency, or #3, start leading. I have tried #1 and #2 and have not been successful due to wrangling, so I am pursuing route #3. If that rankles some folks, the fault is entirely mine. -- What if you woke up and found yourself in a police state? African-Americans wake up in a police state every day. ___ ffmpeg-user mailing list ffmpeg-user@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-user To unsubscribe, visit link above, or email ffmpeg-user-requ...@ffmpeg.org with subject "unsubscribe".
Re: [FFmpeg-user] Glossary: d-NTSC & d-PAL
On 10/01/2020 01:03 PM, Jim DeLaHunt wrote: On 2020-10-01 06:27, Mark Filipak (ffmpeg) wrote: On 09/30/2020 11:56 PM, Jim DeLaHunt wrote: On 2020-09-30 20:36, Mark Filipak (ffmpeg) wrote: Continuing with relatively non controversial entries:
d-NTSC [noun]: 1, The digital equivalent of NTSC distinguished by binary metadata:
720 samples/row: 'horizontal_size_value' = 0010 1101 'horizontal_size_extension' = 00
480 rows: 'vertical_size_value' = 0001 1110 'vertical_size_extension' = 00
4:3 DAR: 'aspect_ratio_information' = 0010
30/1.001 FPS: 'frame_rate_code' = 0100
'progressive_sequence' = 0 & 'progressive_frame' = 0
d-PAL [noun]: 1, The digital equivalent of PAL distinguished by binary metadata:
720 samples/row: 'horizontal_size_value' = 0010 1101 'horizontal_size_extension' = 00
576 rows: 'vertical_size_value' = 0010 0100 'vertical_size_extension' = 00
4:3 DAR: 'aspect_ratio_information' = 0010
25 FPS: 'frame_rate_code' = 0011
'progressive_sequence' = 0 & 'progressive_frame' = 0
It seems to me that these are no longer glossary entries — or, only the first line of each is a glossary entry. ... …The sentence is a statement followed by a list of metadata that distinguishes the subject... Do you have any suggestions? Should I just forget this glossary idea? What makes sense to me is a glossary which includes the entries:
d-NTSC [noun]: 1, The digital equivalent of NTSC
d-PAL [noun]: 1, The digital equivalent of PAL
Then a table of XYZ metadata entries which have been found in the wild: Actually, not in the wild. They are from H.262. [Display the following table with fixed-width font] Actually, Jim, what you sent is not in fixed-width font. :-) So, you favor a formal table? I don't, but I guess that could go without saying, eh? I format in plain text at 148 columns. I've found 70 columns to be too restricting; it promotes crypticism (...is there such a word?). Personally, I don't think formal tables are needed, but if you do, then I'll make a formal table. But first, take a look at the current version, attached. -- What if you woke up and found yourself in a police state? African-Americans wake up in a police state every day.
d-NTSC [noun]: The digital equivalent of NTSC. d-NTSC is distinguished via MPEG-PS binary metadata:
'progressive_sequence' = 0 + 'progressive_frame' = 0 ...is a scan frame
'aspect_ratio_information' = 0010 + 'frame_rate_code' = 0100 ...4:3 DAR + 30/1.001 FPS
'horizontal_size_value' = 0010 1101 + 'horizontal_size_extension' = 00 ...720 samples/row
'vertical_size_value' = 0001 1110 + 'vertical_size_extension' = 00 ...480 rows
d-PAL [noun]: The digital equivalent of PAL. d-PAL is distinguished via MPEG-PS binary metadata:
'progressive_sequence' = 0 + 'progressive_frame' = 0 ...is a scan frame
'aspect_ratio_information' = 0010 + 'frame_rate_code' = 0011 ...4:3 DAR + 25 FPS
'horizontal_size_value' = 0010 1101 + 'horizontal_size_extension' = 00 ...720 samples/row
'vertical_size_value' = 0010 0100 + 'vertical_size_extension' = 00 ...576 rows
___ ffmpeg-user mailing list ffmpeg-user@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-user To unsubscribe, visit link above, or email ffmpeg-user-requ...@ffmpeg.org with subject "unsubscribe".
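The "distinguished via" logic in the attachment reads naturally as a predicate. A sketch of it follows (mine; not an ffmpeg or H.262 API). The field names are the real H.262 ones, but the dictionary interface is hypothetical, and the size fields are given as decoded integers (720, 480, 576) rather than the entry's raw bit strings.

    def classify(md):
        # Return "d-NTSC", "d-PAL", or None, given a dict of decoded H.262 fields.
        if md.get("progressive_sequence") != 0 or md.get("progressive_frame") != 0:
            return None  # not a scan frame
        if (md.get("aspect_ratio_information") != 0b0010      # 4:3 DAR
                or md.get("horizontal_size_value") != 720     # 720 samples/row
                or md.get("horizontal_size_extension") != 0
                or md.get("vertical_size_extension") != 0):
            return None
        if md.get("frame_rate_code") == 0b0100 and md.get("vertical_size_value") == 480:
            return "d-NTSC"  # 30/1.001 FPS, 480 rows
        if md.get("frame_rate_code") == 0b0011 and md.get("vertical_size_value") == 576:
            return "d-PAL"   # 25 FPS, 576 rows
        return None

    print(classify({"progressive_sequence": 0, "progressive_frame": 0,
                    "aspect_ratio_information": 0b0010, "frame_rate_code": 0b0100,
                    "horizontal_size_value": 720, "horizontal_size_extension": 0,
                    "vertical_size_value": 480, "vertical_size_extension": 0}))  # d-NTSC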
Re: [FFmpeg-user] Glossary: Nyquist
On 10/01/2020 12:16 PM, Greg Oliver wrote: On Wed, Sep 30, 2020 at 6:25 PM Mark Filipak (ffmpeg) wrote: -snip- Mark, Normally I would absolutely defend your queries as they are technical and lower level, but I would almost have to side with Bouke from the post ( bwdif filter question ) You are trying to get free editing for your book now. I have no book. I intend to have no book. I'm a retired engineer and don't need book proceeds. I intend to give everything to the ffmpeg project (and anyone else who finds it useful) for free and unconditionally. It is all public domain. By simply posting it, I'm making it all public domain. I do not agree with that.. There are many good contributors and inquisitors (you included), but (IMHO) you cannot solicit things like this that are grammatical rather than technical. I think a lot of the developers are also in the same boat as you (sometimes) try to re-define things that are common language (even if not accurate technically). I'm working on a glossary, not a dictionary. I have no desire to re-define common language. eg - your definition of interlaced versus interweaved.. No matter if you are right or wrong, the concept and understanding of a majority will prevail - no exceptions. We shall see, eh? If there's power in (better?) terms, then they will prevail. If not, then they will die. For what it's worth, I've never written the word "interweaved". Certainly, to cite just one realm, the current nomenclature is quite confused regarding pre-decoding streams v. post-decoding processing. The H.xxx folks leave interpretation to "context". But relying on context relies on understanding, and it is understanding that is lacking. Which would you shoot first? The chicken or the egg? -- Buy this concept or I shoot the dog. Please (for me at least) keep your posts here related to ffmpeg and not trying to change the nomenclature of what exists. We are all using the same software, so whatever the software uses for terminology (as this list is exactly related to), please do not interfere with that. My experience is that the entire video field, not just ffmpeg, is grossly underspecified. That hurts users and developers -- a lot of time is wasted and a lot of feelings are hurt. Based on my 47 years of engineering experience, the first thing that needs to be fixed is to unequivocally and unambiguously define all the terms & structures. To me, that's the low hanging fruit. Then comes the processes, but once the terms & structures are nailed down, I think we'll all discover that documenting the processes will be a snap. Take that up directly with developers and let them sort it out. I would/could never stop them from contributing. But it should be acknowledged that the developers have a developer's perspective. The developer view is like looking out at the world through a pinhole. On a side note - I have yet to see one of your definitions of a technology hold up when a developer chimes in - no hard feelings, just that industry terminology is hard to trump :) Oh, believe me, you've seen nothing yet. I ponder terminology and anguish over every word choice for a long, long time. I doggedly seek to manufacture terms that are intuitive and acceptable to all. The developers have their opinions and have not been shy sharing them. To be honest, I don't see how this (my glossary) can even be an issue. I'm an ffmpeg user and so long as I'm courteous and focus on video issues, the developers should welcome me. If not, then I should be removed from the ffmpeg-user list.
Give this journey the time that it deserves. We all have the same destination in sight, just differing paths to get there. Perhaps there exists no single path, eh? -- What if you woke up and found yourself in a police state? African-Americans wake up in a police state every day. ___ ffmpeg-user mailing list ffmpeg-user@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-user To unsubscribe, visit link above, or email ffmpeg-user-requ...@ffmpeg.org with subject "unsubscribe".
Re: [FFmpeg-user] Glossary: Nyquist
On 10/01/2020 11:20 AM, Eduardo Alarcón wrote: El jue., 1 oct. 2020 a las 12:09, Mark Filipak (ffmpeg) () escribió: On 10/01/2020 10:38 AM, Eduardo Alarcón wrote: -snip- Can you suggest better wording? I'd like to see it. Me too, this is not my native language so I can not suggest a better wording. Oh, you are doing fine. Trust me on this, being a native English speaker isn't all that it's cracked up to be -- note the "cracked up" euphemism. :-) I assume you're a native Spanish speaker. I think that Spanish is a very sensible and logical language. English started out that way but got wrecked by the principle that anyone should be allowed to do whatever they want. -snip- I think it should say that undersampling makes it look bad or blurry, ... Do you think the following -- changed "appearance" to "sharpness" & added "from film" in 2 places -- is improved and satisfies your desires? [1] The Nyquist principle applies to film sampling and to digital cameras, but, provided that resolution is unchanged, not to transcoding (because the transcoder inputs are already digital). As proved by the improved sharpness of SD media made from 2K film samples, SD mastering prior to the advent of 2K sampling (e.g. DVDs mastered from film before the advent of HD) generally ignored the Nyquist principle, so those masters were undersampled. HDs sampled from film at 2K and UHDs sampled from film at 4K are likewise undersampled. Off topic, I find your questions helpful or interesting at least on this list; there are concepts I know and things I don't that I had to look up. Is that "Off topic" in a user list dedicated to video processing? Really? Well, whatever the opinion, I thank you for your kind words. -- What if you woke up and found yourself in a police state? African-Americans wake up in a police state every day. ___ ffmpeg-user mailing list ffmpeg-user@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-user To unsubscribe, visit link above, or email ffmpeg-user-requ...@ffmpeg.org with subject "unsubscribe".
Re: [FFmpeg-user] Glossary: Nyquist
On 10/01/2020 10:38 AM, Eduardo Alarcón wrote: Mostly the noun/adjective part; the part [1] about SD media and mastering seems to be anecdotal information more than a definition of what the Nyquist principle is. You say that the images are undersampled, but what does it mean? What is the impact -- the images look blurry? Maybe you defined "undersampling" in another part. Thanks for your thoughts. They're important. Yes, I could write more. And, yes, the note is anecdotal. I felt that the note introduces the idea that something can be undersampled, with a concrete example of something that is undersampled, so that the concept becomes 'real'. Can you suggest better wording? I'd like to see it. Regarding what undersampling means, it's a common term that can be looked up in a general dictionary rather than a dedicated glossary. That's just my opinion of course. Where do you think the line should be drawn? -- What if you woke up and found yourself in a police state? African-Americans wake up in a police state every day. ___ ffmpeg-user mailing list ffmpeg-user@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-user To unsubscribe, visit link above, or email ffmpeg-user-requ...@ffmpeg.org with subject "unsubscribe".
Re: [FFmpeg-user] Glossary: Nyquist
On 10/01/2020 01:20 AM, Eduardo Alarcón wrote: Nyquist is a noun, not an adjective, for Harry Nyquist. ... Hi Eduardo, Thanks. Well, technically, a noun used as an adjective: "Nyquist sampling", makes it an adjective, but no matter. What if I narrow the scope to solely the sampling theory? thusly: Nyquist sampling: The principle [1] that, to most faithfully reproduce an image at a given digital display's resolution, samples must be made at or above twice the display's resolution, both horizontally & vertically [2]. [1] The Nyquist principle applies to film sampling and to digital cameras, but, provided that resolution is unchanged, not to transcoding (because the transcoder inputs are already digital). As proved by the improved appearance of SD media made from 2K samples, SD mastering prior to the advent of 2K sampling (e.g. DVDs mastered before the advent of HD) generally ignored the Nyquist principle, so those masters were undersampled. HDs sampled at 2K and UHDs sampled at 4K are likewise undersampled. [2] As a convenience, the Nyquist threshold is currently (in 2020) specified solely by horizontal sample count rounded up to whole kilo-samples (2K, 4K, 8K).
display        Nyquist threshold
UHD 16:9-2160: 3840 x 2160  8K
    4:3-2160:  2880 x 2160  8K
HD  16:9-1080: 1920 x 1080  4K
    4:3-1080:  1440 x 1080  4K
SD  16:9-576:  1024 x 576   4K
    4:3-576:   768 x 576    2K
    16:9-480:  853 x 480    2K
    4:3-480:   640 x 480    2K
... The Nyquist–Shannon sampling theorem is applicable to analog to digital conversion of signals (continuous to discrete); images are a type of signal. Well, I thought that was what I wrote. What doesn't work for you? The reason I wrote "both horizontally & vertically" was to convey that, unlike sampling a 1-dimensional (serial) signal, 2-dimensional sampling (e.g. from film, or within a camera) requires that the Nyquist principle be applied in both dimensions. But perhaps that's not what you find lacking. Could you suggest different wording maybe? -- What if you woke up and found yourself in a police state? African-Americans wake up in a police state every day. ___ ffmpeg-user mailing list ffmpeg-user@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-user To unsubscribe, visit link above, or email ffmpeg-user-requ...@ffmpeg.org with subject "unsubscribe".
Re: [FFmpeg-user] Glossary: d-NTSC & d-PAL
On 09/30/2020 11:56 PM, Jim DeLaHunt wrote: On 2020-09-30 20:36, Mark Filipak (ffmpeg) wrote: Continuing with relatively non controversial entries:
d-NTSC [noun]: 1, The digital equivalent of NTSC distinguished by binary metadata:
720 samples/row: 'horizontal_size_value' = 0010 1101 'horizontal_size_extension' = 00
480 rows: 'vertical_size_value' = 0001 1110 'vertical_size_extension' = 00
4:3 DAR: 'aspect_ratio_information' = 0010
30/1.001 FPS: 'frame_rate_code' = 0100
'progressive_sequence' = 0 & 'progressive_frame' = 0
d-PAL [noun]: 1, The digital equivalent of PAL distinguished by binary metadata:
720 samples/row: 'horizontal_size_value' = 0010 1101 'horizontal_size_extension' = 00
576 rows: 'vertical_size_value' = 0010 0100 'vertical_size_extension' = 00
4:3 DAR: 'aspect_ratio_information' = 0010
25 FPS: 'frame_rate_code' = 0011
'progressive_sequence' = 0 & 'progressive_frame' = 0
It seems to me that these are no longer glossary entries — or, only the first line of each is a glossary entry. ... Each entry is just one sentence, so I guess you don't like the sentence spanning multiple lines (?) The sentence is a statement followed by a list of metadata that distinguishes the subject... so that people can determine whether a particular video is a d-NTSC video or not a d-NTSC video for example. The main distinguishing feature is 'progressive_sequence' = 0 & 'progressive_frame' = 0 of course. The others narrow the scope to just a single species of video. If a glossary entry requires explanation, then it's a fail. What fails? I guess I didn't anticipate such a total-failure mode. Do you have any suggestions? Should I just forget this glossary idea? ... The rest seems to be a description of a data structure or representation. The entry doesn't say to what format or specification the representation applies. To an MPEG-2 video? To the ISO file corresponding to a DVD? -- What if you woke up and found yourself in a police state? African-Americans wake up in a police state every day. ___ ffmpeg-user mailing list ffmpeg-user@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-user To unsubscribe, visit link above, or email ffmpeg-user-requ...@ffmpeg.org with subject "unsubscribe".
[FFmpeg-user] Glossary: d-NTSC & d-PAL
Continuing with relatively non controversial entries:
d-NTSC [noun]: 1, The digital equivalent of NTSC distinguished by binary metadata:
720 samples/row: 'horizontal_size_value' = 0010 1101 'horizontal_size_extension' = 00
480 rows: 'vertical_size_value' = 0001 1110 'vertical_size_extension' = 00
4:3 DAR: 'aspect_ratio_information' = 0010
30/1.001 FPS: 'frame_rate_code' = 0100
'progressive_sequence' = 0 & 'progressive_frame' = 0
d-PAL [noun]: 1, The digital equivalent of PAL distinguished by binary metadata:
720 samples/row: 'horizontal_size_value' = 0010 1101 'horizontal_size_extension' = 00
576 rows: 'vertical_size_value' = 0010 0100 'vertical_size_extension' = 00
4:3 DAR: 'aspect_ratio_information' = 0010
25 FPS: 'frame_rate_code' = 0011
'progressive_sequence' = 0 & 'progressive_frame' = 0
-- What if you woke up and found yourself in a police state? African-Americans wake up in a police state every day. ___ ffmpeg-user mailing list ffmpeg-user@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-user To unsubscribe, visit link above, or email ffmpeg-user-requ...@ffmpeg.org with subject "unsubscribe".
[FFmpeg-user] Glossary: Nyquist
Nyquist [adjective]: 1, Reference to the Nyquist–Shannon sampling theorem. 2, The principle [1] that, to most faithfully reproduce an image at a given digital display's resolution, the samples must be made at or above twice the display's resolution, both horizontally & vertically [2]. [1] The Nyquist principle applies to film sampling and to digital cameras, but, provided that resolution is unchanged, not to transcoding (because the transcoder inputs are already digital). As proved by the improved appearance of SD media made from 2K samples, SD mastering prior to the advent of 2K sampling (e.g. DVDs mastered before the advent of HD) generally ignored the Nyquist principle, so those masters were undersampled. HDs sampled at 2K and UHDs sampled at 4K are likewise undersampled. [2] As a convenience, the Nyquist threshold is currently (in 2020) specified solely by horizontal sample count rounded up to whole kilo-samples (2K, 4K, 8K).
display        Nyquist threshold
UHD 16:9-2160: 3840 x 2160  8K
    4:3-2160:  2880 x 2160  8K
HD  16:9-1080: 1920 x 1080  4K
    4:3-1080:  1440 x 1080  4K
SD  16:9-576:  1024 x 576   4K
    4:3-576:   768 x 576    2K
    16:9-480:  853 x 480    2K
    4:3-480:   640 x 480    2K
-- What if you woke up and found yourself in a police state? African-Americans wake up in a police state every day. ___ ffmpeg-user mailing list ffmpeg-user@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-user To unsubscribe, visit link above, or email ffmpeg-user-requ...@ffmpeg.org with subject "unsubscribe".
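A note on reading the table: the K labels come out consistently if "2K", "4K", and "8K" are taken to mean 1920, 3840, and 7680 horizontal samples -- my reading, not something stated in the entry. Under that assumption, this sketch reproduces every row:

    # Double each display dimension, then label with the smallest
    # kilo-sample tier (assumed: 2K=1920, 4K=3840, 8K=7680) that covers it.
    TIERS = (("2K", 1920), ("4K", 3840), ("8K", 7680))

    def tier(width):
        return next(name for name, w in TIERS if w >= width)

    displays = [("16:9-2160", 3840, 2160), ("4:3-2160", 2880, 2160),
                ("16:9-1080", 1920, 1080), ("4:3-1080", 1440, 1080),
                ("16:9-576", 1024, 576), ("4:3-576", 768, 576),
                ("16:9-480", 853, 480), ("4:3-480", 640, 480)]

    for name, w, h in displays:
        print(f"{name}: sample at >= {2*w} x {2*h} -> {tier(2*w)}")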
Re: [FFmpeg-user] Glossary: DAR, PAR, and SAR
Revision: Shorter sentences, better consistency, no extra 'lesson' about how to minimize ratios. Formatted for email, plain text.
DAR (display aspect ratio [1]) [noun]: 1, The width-to-height ratio (W:H, e.g. 16:9, 4:3) for the intended display. DAR is distinguished by metadata: 'aspect_ratio_information', (also see "SAR", note [2]). 2, H.262 §3.44: "The ratio of height divided by width (in spatial measurement units such as centimetres) of the intended display." [2].
[1] It's mistakenly asserted by some that "DAR" is an acronym for "data aspect ratio" or "disc aspect ratio".
[2] Criticism: H.262 claims that DAR is a "ratio", then wrongly defines it as a quotient (which it turns upside down): "height divided by width"; also H.262 §6.3.3 (i.e. "3 ÷ 4", "9 ÷ 16").
PAR (picture aspect ratio [1]) [noun]: 1, The horizontal-to-vertical size [3] ratio (H:V, e.g. 5:4, 3:2) for pictures. PAR is distinguished by metadata: 'horizontal_size' & 'vertical_size', [2] [4] (also see "SAR", note [2]).
[1] It's mistakenly asserted by some that "PAR" is an acronym for "pixel aspect ratio".
[2] PAR can also be calculated from DAR & SAR thusly: PAR = DAR/SAR.
[3] Note that PAR is virtual (i.e. defined by dataset indexes, not by physical dimensions).
[4] Criticism: H.262 doesn't define PAR, however, it does define a quotient that correlates with PAR, to wit: H.262 §6.3.3, aspect_ratio_information: "SAR = DAR × horizontal_size/vertical_size". The foregoing implies that H.262 would have defined PAR as vertical_size/horizontal_size. Opinion: By defining DAR & SAR as quotients (which it turns upside down), and by implying that metadata: 'aspect_ratio_information', is also such a quotient (which it also turns upside down), H.262 causes much confusion that helps explain why so many Internet sites get DAR, PAR, and SAR wrong.
SAR (sample aspect ratio [1]) [noun]: 1, The physical horizontal-to-vertical spacing ratio (H:V) for samples [2][3]. SAR is distinguished as a computed value: DAR/PAR. 2, H.262 §3.114: "This specifies the relative distance between samples. It is defined (for the purposes of Rec. ITU-T H.262 | ISO/IEC 13818-2), as the vertical displacement of the lines of luminance samples in a frame divided by the horizontal displacement of the luminance samples [2]. Thus, its units are (metres per line) ÷ (metres per sample)." [7].
[1] It's mistakenly asserted by some that "SAR" is an acronym for "storage aspect ratio".
[2] A standardized set of picture sizes & aspects has been established:
display                 DAR    picture       PAR    SAR = DAR/PAR
16:9-2160: 3840 x 2160  16:9 : 3840 x 2160   16:9 : 1:1
4:3-2160:  2880 x 2160  4:3  : 2880 x 2160   4:3  : 1:1
16:9-1080: 1920 x 1080  16:9 : 1920 x 1080   16:9 : 1:1
4:3-1080:  1440 x 1080  4:3  : 1440 x 1080   4:3  : 1:1
16:9-576:  1024 x 576   16:9 : 720 x 576     5:4  : 64:45
4:3-576:   768 x 576    4:3  : 720 x 576     5:4  : 16:15
16:9-480:  853 x 480    16:9 : 720 x 480     3:2  : 32:27
4:3-480:   640 x 480    4:3  : 720 x 480     3:2  : 8:9 [4]
[3] Ideally, SAR would also be the width-to-height ratio of the sampling aperture, but that is not mandatory.
[4] Example: If a 35mm film area (0.906 x 0.680 inches) is to produce 345,600 samples (visual density) with 480 rows (vertical resolution), then each row must have 720 samples (horizontal resolution) [5] and sample spacing should be 32 µm horizontally by 36 µm vertically [6].
[5] (345,600 samples)/(480 rows).
[6] (0.906 in)(25400 µm/in)/720 by (0.680 in)(25400 µm/in)/480 = 32 by 36 µm = 32:36 = 8:9 aspect ratio.
[7] Criticism: H.262 claims that SAR is a "ratio", then, as it does with DAR, wrongly defines it as a quotient (which it turns upside down). -- What if you woke up and found yourself in a police state? African-Americans wake up in a police state every day. ___ ffmpeg-user mailing list ffmpeg-user@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-user To unsubscribe, visit link above, or email ffmpeg-user-requ...@ffmpeg.org with subject "unsubscribe".
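Because all three quantities are exact ratios, the identity SAR = DAR/PAR can be checked with rational arithmetic. A small worked check of two rows from note [2] (my example, using Python's fractions module):

    from fractions import Fraction as F

    rows = [("16:9-576", F(16, 9), F(720, 576)),   # PAR 720:576 reduces to 5:4
            ("4:3-480", F(4, 3), F(720, 480))]     # PAR 720:480 reduces to 3:2
    for name, dar, par in rows:
        print(f"{name}: DAR {dar} / PAR {par} -> SAR {dar / par}")
    # 16:9-576: DAR 16/9 / PAR 5/4 -> SAR 64/45
    # 4:3-480: DAR 4/3 / PAR 3/2 -> SAR 8/9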
Re: [FFmpeg-user] Glossary: DAR, PAR, and SAR
On 09/30/2020 06:08 PM, Jim DeLaHunt wrote: On 2020-09-30 13:09, Mark Filipak (ffmpeg) wrote: I seek your comments regarding the following glossary entries (that have been reformatted to fit email). Kindly take your time. -Mark. DAR (display aspect ratio [1]) [noun]: 1, … as a minimized, H:V, integer ratio (e.g. 16:9, 4:3)… (and similar wording for PAR and SAR): Is the term "minimized aspect ratio" already used in this domain? If not, then the related mathematics term "reduced fraction" may be a better inspiration than "minimized". Excellent! I'll go for it. SAR [noun]:… [2] A standardized set of picture sizes & aspects has been established:
display                 DAR    picture       PAR    SAR = DAR/PAR
16:9-2160: 3840 x 2160  16:9 : 3840 x 2160   16:9 : 1:1
4:3-2160:  2880 x 2160  4:3  : 2880 x 2160   4:3  : 1:1
16:9-1080: 1920 x 1080  16:9 : 1920 x 1080   16:9 : 1:1
4:3-1080:  1440 x 1080  4:3  : 1440 x 1080   4:3  : 1:1
16:9-576:  1024 x 576   16:9 : 720 x 576     5:4  : 64:45
4:3-576:   768 x 576    4:3  : 720 x 576     5:4  : 16:15
16:9-480:  853 x 480    16:9 : 720 x 480     3:2  : 32:27
4:3-480:   640 x 480    4:3  : 720 x 480     3:2  : 8:9 [3]
The formatting of the email garbles the table [2] enough that I can't be sure what I'm reading. Maybe insert printable delimiters into each line? Hmmm... It's a table: rows and columns, like a spreadsheet. Does making the column headers more obvious work better?
display                 DAR    picture       PAR    SAR = DAR/PAR
=====================   ====   ===========   ====   =============
16:9-2160: 3840 x 2160  16:9 : 3840 x 2160   16:9 : 1:1
4:3-2160:  2880 x 2160  4:3  : 2880 x 2160   4:3  : 1:1
16:9-1080: 1920 x 1080  16:9 : 1920 x 1080   16:9 : 1:1
4:3-1080:  1440 x 1080  4:3  : 1440 x 1080   4:3  : 1:1
16:9-576:  1024 x 576   16:9 : 720 x 576     5:4  : 64:45
4:3-576:   768 x 576    4:3  : 720 x 576     5:4  : 16:15
16:9-480:  853 x 480    16:9 : 720 x 480     3:2  : 32:27
4:3-480:   640 x 480    4:3  : 720 x 480     3:2  : 8:9 [3]
Or is it the column centering that doesn't work? -- What would you do if you woke up in a police state? African-Americans wake up in a police state every day. ___ ffmpeg-user mailing list ffmpeg-user@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-user To unsubscribe, visit link above, or email ffmpeg-user-requ...@ffmpeg.org with subject "unsubscribe".
[FFmpeg-user] Glossary: DAR, PAR, and SAR
I seek your comments regarding the following glossary entries (that have been reformatted to fit email). Kindly take your time. -Mark.

DAR (display aspect ratio [1]) [noun]: 1, The intended display's width-to-height aspect as a minimized, H:V, integer ratio (e.g. 16:9, 4:3) distinguished by metadata: 'aspect_ratio_information' (also see "SAR", note [2]). 2, H.262 §3.44: "The ratio of height divided by width (in spatial measurement units such as centimetres) of the intended display" [2].

[1] It's mistakenly asserted by some that "DAR" is an acronym for "data aspect ratio" or "disc aspect ratio".
[2] Criticism: H.262 claims that DAR is a "ratio", then wrongly defines it as a quotient: "height divided by width"; also H.262 §6.3.3 (i.e. "3 ÷ 4", "9 ÷ 16").

PAR (picture aspect ratio [1]) [noun]: 1, The metadata ratio: 'horizontal_size':'vertical_size' [3], as a minimized, H:V, integer ratio (e.g. 5:4, 3:2) [2][4] (also see "SAR", note [2]).

[1] It's mistakenly asserted by some that "PAR" is an acronym for "pixel aspect ratio".
[2] PAR can also be calculated from DAR & SAR thusly: PAR = DAR/SAR.
[3] Note that PAR is virtual (i.e. defined by dataset indexes, not physical dimensions).
[4] Criticism: H.262 doesn't define PAR; however, it does define a quotient that correlates with PAR, to wit: H.262 §6.3.3, aspect_ratio_information: "SAR = DAR × horizontal_size/vertical_size". The foregoing implies that H.262 would have defined PAR as vertical_size/horizontal_size. Opinion: By defining DAR & SAR as quotients that turn the standard ratio definitions on their heads, and by implying that metadata: 'aspect_ratio_information', is also such a quotient that also turns the standard ratio definition on its head, H.262 causes much confusion that helps explain why so many Internet sites get DAR, PAR, and SAR wrong.

SAR [noun]: 1, Sample aspect ratio [1][2], the physical horizontal-to-vertical sample spacing [6] as a minimized, H:V, integer ratio [3]. 2, H.262 §3.114: "This specifies the relative distance between samples. It is defined (for the purposes of Rec. ITU-T H.262 | ISO/IEC 13818-2), as the vertical displacement of the lines of luminance samples in a frame divided by the horizontal displacement of the luminance samples [2]. Thus, its units are (metres per line) ÷ (metres per sample)." [7].

[1] It's mistakenly asserted by some that "SAR" is an acronym for "storage aspect ratio".
[2] A standardized set of picture sizes & aspects has been established:

display                DAR    picture          PAR    SAR = DAR/PAR
16:9-2160: 3840 x 2160 16:9 : 3840 x 2160      16:9 : 1:1
4:3-2160:  2880 x 2160 4:3  : 2880 x 2160      4:3  : 1:1
16:9-1080: 1920 x 1080 16:9 : 1920 x 1080      16:9 : 1:1
4:3-1080:  1440 x 1080 4:3  : 1440 x 1080      4:3  : 1:1
16:9-576:  1024 x 576  16:9 : 720 x 576        5:4  : 64:45
4:3-576:   768 x 576   4:3  : 720 x 576        5:4  : 16:15
16:9-480:  853 x 480   16:9 : 720 x 480        3:2  : 32:27
4:3-480:   640 x 480   4:3  : 720 x 480        3:2  : 8:9 [3]

[3] Example: If a 35mm film area (0.906 x 0.680 inches) is to produce 345,600 samples (visual density) with 480 rows (vertical resolution), then each row must have 720 samples (horizontal resolution) [4] and sample spacing should be 32 µm horizontally by 36 µm vertically [5].
[4] (345,600 samples)/(480 rows) = 720 samples per row.
[5] (0.906 in)(25400 µm/in)/720 by (0.680 in)(25400 µm/in)/480 = 32 by 36 µm = 32:36 = 8:9 aspect ratio.
[6] Ideally, SAR would also be the width-to-height ratio of the sampling aperture, but that is not mandatory.
[7] Criticism: H.262 claims that SAR is a "ratio", then, as it does with DAR, (wrongly, in the opinion of many) defines it as a quotient. -- What would you do if you woke up in a police state? African-Americans wake up in a police state every day. ___ ffmpeg-user mailing list ffmpeg-user@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-user To unsubscribe, visit link above, or email ffmpeg-user-requ...@ffmpeg.org with subject "unsubscribe".
Re: [FFmpeg-user] should I shoot the dog?
On 09/29/2020 12:57 PM, Devin Heitmueller wrote: On Tue, Sep 29, 2020 at 12:28 PM Mark Filipak (ffmpeg) wrote: -snip- I would encourage you to stop trying to invent new terminology ... -snip- With due respect to you, I'm not trying to invent new terminology. I'm trying to create extended terminology that builds on the existing terminology. But we shall see, eh? If what I do is crap, then I'll be the first to throw it away. I've thrown away weeks of work in the past.

YCbCr420 sampleset: A sampleset with sample-quads:
.---.---.
¦ S ¦ S ¦
:---:---:
¦ S ¦ S ¦
'---'---',
reduced to 1/4 chrominance resolution:
.---.---.  .---.  .---.
¦ Y ¦ Y ¦  ¦   ¦  ¦   ¦
:---:---:  ¦Cb ¦  ¦Cr ¦
¦ Y ¦ Y ¦  ¦   ¦  ¦   ¦
'---'---'  '---'  '---',
distinguished by binary metadata: 'chroma_format' = 01. (See "Cb420 & Cr420 macroblocks", "Y macroblock".)

YCbCr422 sampleset: A sampleset with sample-quads:
.---.---.
¦ S ¦ S ¦
:---:---:
¦ S ¦ S ¦
'---'---',
reduced to 1/2 chrominance resolution:
.---.---.  .---.  .---.
¦ Y ¦ Y ¦  ¦Cb ¦  ¦Cr ¦
:---:---:  :---:  :---:
¦ Y ¦ Y ¦  ¦Cb ¦  ¦Cr ¦
'---'---'  '---'  '---',
distinguished by binary metadata: 'chroma_format' = 10. (See "Cb422 & Cr422 macroblocks", "Y macroblock".)

YCbCr444 sampleset: A sampleset with sample-quads:
.---.---.
¦ S ¦ S ¦
:---:---:
¦ S ¦ S ¦
'---'---',
having full chrominance resolution:
.---.---.  .---.---.  .---.---.
¦ Y ¦ Y ¦  ¦Cb ¦Cb ¦  ¦Cr ¦Cr ¦
:---:---:  :---:---:  :---:---:
¦ Y ¦ Y ¦  ¦Cb ¦Cb ¦  ¦Cr ¦Cr ¦
'---'---'  '---'---'  '---'---',
distinguished by binary metadata: 'chroma_format' = 11. (See "Cb444 & Cr444 macroblocks", "Y macroblock".)

The diagrams are probably fine, but probably not how I would draw them given they blur the relationship between packed and planar. Either it's packed, in which case you should probably show 4:2:2 as YCbYCr, or it's planar in which case the Cb/Cr samples should not be adjacent per line (i.e. have all the Y lines followed by all the Cr/Cb lines). You may wish to take into account your newfound understanding of packed vs. planar to redo these diagrams representing the data as either one or the other. Thank you, Devin. Yes, the diagrams are incomplete. And, yes, I will do diagrams that take planar v. packed into account. I will post them when completed. May I also say that I appreciate your attitude: That seekers are not stupid or trolls. Regarding "adjacent per line", the references to "Cb444 & Cr444 macroblocks", "Y macroblock" make that clear, but I will revise the note to better indicate that the chroma subsamples are not adjacent. Regarding "4:2:2 as YCbYCr" packed, I can't fully visualize it because, I think, there should be 4 Y samples, not 2. But don't explain it, though. Not yet. Wait until I post a diagram of it and then let me know what you think and how that diagram is wrong. :-) I don't want to exploit your generosity. I'll do the grunt work. I would probably also refrain from using the term "macroblock" to describe the raw decoded video, as macroblocks are all about how the pixels are organized in the compressed domain. Once they are decoded there is no notion of macroblocks in the resulting video frames. Got it. Regarding "compressed domain" (in which macroblocks are sparse), that's what I initially thought, but H.262 pretty strongly implies that macroblocks also apply to raw video. That seems logical to me (as datasets prior to compression). Unrelated: In the glossary, I seek to always have "distinguished by" clauses so that readers can be sure about when and where a particular definition applies. ...
If the video frame is interlaced however, the first chroma sample corresponds to the first two luma samples on line 1 and the first two luma samples on line 3. The first chroma sample on the second line of chroma corresponds with the first two luma samples on line 2 and the first two luma samples on line 4. I have pictures of those, too. What do you think of the above pictures? Do you a, like them, or b, loathe them, or c, find them unnecessary? I would probably see if you can find drawings already out there. For example the Wikipedia article on YUV has some pretty good representations for pixel arrangement in various pixel formats. So does the LinuxTV documentation. Thanks for the tips. This is known as "interlaced chroma" and a Google search will reveal lots of cases where it's done wrong and what the effects are.
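A compact way to restate the three subsampling schemes in the sampleset entries above is by plane dimensions; a sketch in C (W and H are the luma width and height; the numbers are the progressive case, not the interlaced-chroma case discussed in this thread):

    #include <stdio.h>

    /* Plane dimensions implied by the three samplesets above.
     * 4:2:0 halves the chroma in both directions, 4:2:2 only
     * horizontally, 4:4:4 not at all. */
    int main(void) {
        int W = 720, H = 480;
        printf("4:2:0  Y: %dx%d, Cb & Cr: %dx%d each\n", W, H, W / 2, H / 2);
        printf("4:2:2  Y: %dx%d, Cb & Cr: %dx%d each\n", W, H, W / 2, H);
        printf("4:4:4  Y: %dx%d, Cb & Cr: %dx%d each\n", W, H, W, H);
        return 0;
    }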
Re: [FFmpeg-user] should I shoot the dog?
On 09/29/2020 11:44 AM, Devin Heitmueller wrote: On Tue, Sep 29, 2020 at 11:29 AM Mark Filipak (ffmpeg) wrote: Oh, dear, that's what "packed" means? ...very misleading name, eh? How are fields handled? Are the pixels assumed to be unfielded (meaning so-called "progressive")? So the topic of how interlaced video is handled in subsampled video is something we could spend an hour on by itself. In the Luma space, the Y samples are organized in interleaved form (i.e. lines of top/bottom/top/bottom). ... Top/bottom/top/bottom, especially if full lines, seems like straightforward interlaced to me. Or do I misunderstand? ... Because of chroma subsampling and the fact that multiple lines can share chroma samples, this gets tricky. ... Our messages crossed in transit, but I'm going to assume that you've seen my "In macroblock format..." post (in this subject thread). ... In the simple progressive case for 4:2:0, you'll have the first Chroma sample corresponding to the first two luma samples on line 1 and the first two luma samples on line 2. ... I assume you meant to write "and the *next* two luma samples on line 2". That 'sounds' like what I'm calling sample-quads. The following is from the glossary I'm working on (reformatted to fit email).

YCbCr420 sampleset: A sampleset with sample-quads:
.---.---.
¦ S ¦ S ¦
:---:---:
¦ S ¦ S ¦
'---'---',
reduced to 1/4 chrominance resolution:
.---.---.  .---.  .---.
¦ Y ¦ Y ¦  ¦   ¦  ¦   ¦
:---:---:  ¦Cb ¦  ¦Cr ¦
¦ Y ¦ Y ¦  ¦   ¦  ¦   ¦
'---'---'  '---'  '---',
distinguished by binary metadata: 'chroma_format' = 01. (See "Cb420 & Cr420 macroblocks", "Y macroblock".)

YCbCr422 sampleset: A sampleset with sample-quads:
.---.---.
¦ S ¦ S ¦
:---:---:
¦ S ¦ S ¦
'---'---',
reduced to 1/2 chrominance resolution:
.---.---.  .---.  .---.
¦ Y ¦ Y ¦  ¦Cb ¦  ¦Cr ¦
:---:---:  :---:  :---:
¦ Y ¦ Y ¦  ¦Cb ¦  ¦Cr ¦
'---'---'  '---'  '---',
distinguished by binary metadata: 'chroma_format' = 10. (See "Cb422 & Cr422 macroblocks", "Y macroblock".)

YCbCr444 sampleset: A sampleset with sample-quads:
.---.---.
¦ S ¦ S ¦
:---:---:
¦ S ¦ S ¦
'---'---',
having full chrominance resolution:
.---.---.  .---.---.  .---.---.
¦ Y ¦ Y ¦  ¦Cb ¦Cb ¦  ¦Cr ¦Cr ¦
:---:---:  :---:---:  :---:---:
¦ Y ¦ Y ¦  ¦Cb ¦Cb ¦  ¦Cr ¦Cr ¦
'---'---'  '---'---'  '---'---',
distinguished by binary metadata: 'chroma_format' = 11. (See "Cb444 & Cr444 macroblocks", "Y macroblock".)

... If the video frame is interlaced however, the first chroma sample corresponds to the first two luma samples on line 1 and the first two luma samples on line 3. The first chroma sample on the second line of chroma corresponds with the first two luma samples on line 2 and the first two luma samples on line 4. I have pictures of those, too. What do you think of the above pictures? Do you a, like them, or b, loathe them, or c, find them unnecessary? This is known as "interlaced chroma" and a Google search will reveal lots of cases where it's done wrong and what the effects are. This is the article I usually refer people to: https://hometheaterhifi.com/technical/technical-reviews/the-chroma-upsampling-error-and-the-420-interlaced-chroma-problem/ The above article does a really good job explaining the behavior (far better than I could do in the one paragraph above). I've seen that produce mild combing. I'll read your reference. -- The U.S. political problem? Amateurs are doing the street fighting. The Princeps Senatus and the Tribunus Plebis need their own armies.
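The luma-line pairing Devin describes (and that the hometheaterhifi article illustrates) can be written down as a line-index mapping; a sketch in C, under the assumption that 4:2:0 interlaced chroma is subsampled within each field (the function names are illustrative):

    /* Progressive 4:2:0: chroma row c serves luma rows 2c and 2c+1. */
    static int chroma_line_progressive(int luma_line) {
        return luma_line / 2;
    }

    /* Interlaced 4:2:0 ("interlaced chroma"): chroma is subsampled within
     * each field, so frame luma lines 0 & 2 share chroma line 0, luma
     * lines 1 & 3 share chroma line 1, luma lines 4 & 6 share chroma
     * line 2, and so on. */
    static int chroma_line_interlaced(int luma_line) {
        int field = luma_line & 1;          /* 0 = top field, 1 = bottom   */
        int line_in_field = luma_line >> 1; /* line number within the field */
        return (line_in_field >> 1) * 2 + field;
    }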
Re: [FFmpeg-user] should I shoot the dog?
On 09/29/2020 11:09 AM, Dave Stevens wrote: On Tue, 29 Sep 2020 10:48:42 -0400 "Mark Filipak (ffmpeg)" wrote: Hi Devin. Thanks much! Your response came in while I was composing my previous message. I see (below) that performance is a Because it reverses the normal order of reading! Why not top post? Hi Dave, Top posting is discouraged in the ffmpeg-user list. I personally loathe top posting and prefer an interleaved, call-and-response model. However, in the cited case, I felt that call-and-response would not have worked and would simply have been boring and "me too". In other words, I just wanted to acknowledge Devin's contribution and thank him one time, in one place. -- The U.S. political problem? Amateurs are doing the street fighting. The Princeps Senatus and the Tribunus Plebis need their own armies. ___ ffmpeg-user mailing list ffmpeg-user@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-user To unsubscribe, visit link above, or email ffmpeg-user-requ...@ffmpeg.org with subject "unsubscribe".
Re: [FFmpeg-user] should I shoot the dog?
On 09/29/2020 10:46 AM, Michael Koch wrote: Am 29.09.2020 um 16:26 schrieb Mark Filipak (ffmpeg): On 09/29/2020 09:37 AM, Michael Koch wrote: Am 29.09.2020 um 14:58 schrieb Mark Filipak (ffmpeg): On 09/29/2020 04:06 AM, Michael Koch wrote: Am 29.09.2020 um 04:28 schrieb Mark Filipak (ffmpeg): I just want to understand the frame structures that ffmpeg creates, and that ffmpeg uses in processing and filtering. Are Y, Cb, Cr separate buffers? That would be logical. Or are the Y, Cb, Cr values combined and organized similarly to macroblocks? I've found some code that supports that. Or are the Y, Cb, Cr values thrown together, pixel-by-pixel. That would be logical, too. As far as I understood it, that depends on the pixel format. For example there are "packed" pixel formats rgb24, bgr24, argb, rgba, abgr, bgra, rgb48be, rgb48le, bgr48be, bgr48le. And there are "planar" pixel formats gbrp, bgrp16be, bgrp16le. Hi Michael, "Packed" and "planar", eh? What evidence do you have? ...Share the candy! As far as I know, this is not described in the official documentation. You can find it for example here: https://video.stackexchange.com/questions/16374/ffmpeg-pixel-format-definitions Thanks for that. It saved me some time. ...So, what does "planar" mean? What does "packed" mean? Here is an example for a very small image with 3 x 2 pixels. In (packed) RGB24 format: RGBRGBRGBRGBRGBRGB Oh, dear, that's what "packed" means? ...very misleading name, eh? How are fields handled? Are the pixels assumed to be unfielded (meaning so-called "progressive")? In (planar) GBRP format: GGGGGGBBBBBBRRRRRR What about fields? In macroblock format, samples are 1st spatially divided vertically into by-16 slices, then spatially divided within slices into by-16 macroblocks, then, within macroblocks, divided into (combined) colorant-field blocks: Ytop Ytop Ybottom Ybottom Cb Cr, and, within Cb Cr colorants, into field half-blocks, and finally, interleaved by 2 levels of interleaving. ...An overly complicated and (to me) ill-conceived set of datasets that illustrates (to me) that the video "engineers" of the Moving Picture Experts Group are lightweight engineers and have hacked a "system". It is structure at the field level that I'm most interested in, but a deep dive would be fun. -- The U.S. political problem? Amateurs are doing the street fighting. The Princeps Senatus and the Tribunus Plebis need their own armies. ___ ffmpeg-user mailing list ffmpeg-user@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-user To unsubscribe, visit link above, or email ffmpeg-user-requ...@ffmpeg.org with subject "unsubscribe".
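Michael's 3 x 2 example, written out as explicit memory layouts; this is just a sketch of the byte ordering, one letter per byte:

    /* A 3x2 image, one byte per sample. Packed RGB24 stores each pixel's
     * three samples together; planar GBRP stores each component plane
     * whole, one after the other. */
    unsigned char packed_rgb24[18] = {
        'R','G','B', 'R','G','B', 'R','G','B',   /* row 0: 3 pixels */
        'R','G','B', 'R','G','B', 'R','G','B',   /* row 1: 3 pixels */
    };
    unsigned char planar_gbrp[18] = {
        'G','G','G', 'G','G','G',                /* G plane, rows 0 and 1 */
        'B','B','B', 'B','B','B',                /* B plane, rows 0 and 1 */
        'R','R','R', 'R','R','R',                /* R plane, rows 0 and 1 */
    };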
Re: [FFmpeg-user] should I shoot the dog?
On 09/29/2020 09:20 AM, Devin Heitmueller wrote: Hi Mark, Hi Devin. Thanks much! Your response came in while I was composing my previous message. I see (below) that performance is a major issue. That absolutely makes sense because, after accuracy, speed is the next most important objective (and for some use cases, may actually be more important). I imagine that format-to-format conversion is probably the most optimized code in ffmpeg. Is there a function library dedicated solely to format conversion? I ask so that, in what I write, I can assure users that the issues are known and addressed. For my modest purposes, a sketch of planar v. packed is probably all that's needed. I think you've made "planar" clear. Thank you for that. I can imagine that the structure of packed is multitudinous. Why is it called "packed"? How is it packed? Are the luma & chroma mixed in one buffer (analogous to blocks in macroblocks) or split into discrete buffers? How are they spatially structured? Are there any special sub structures (analogous to macroblocks in slices)? Are the sub structures, if any, format dependent? So when you talk about the decoded frames, there is no concept of macroblocks. There are simple video frames with Y, Cb, Cr samples. How those samples are organized and their sizes are determined by the AVFrame format. "Packed" and "planar", eh? What evidence do you have? ...Share the candy! Now, I'm not talking about streams. I'm talking about after decoding. I'm talking about the buffers. I would think that a single, consistent format would be used. When dealing with typical consumer MPEG-2 or H.264 video, the decoded frames will typically have what's referred to as "4:2:0 planar" format. This means that the individual Y/Cb/Cr samples are not contiguous. If you look at the underlying data that makes up the frame as an array, you will typically have W*H Y values, followed by W*H/4 Cb values, and then there will be W*H/4 Cr values. Note that I say "values" and not "bytes", as the size of each value may vary depending on the pixel format. Unfortunately there is no "single, consistent format" because of the variety of different ways in which the video can be encoded, as well as performance concerns. Normalizing it to a single format can have a huge performance cost, in particular if the original video is in a different colorspace (e.g. the video is YUV and you want RGB). Generally speaking you can set up the pipeline to always deliver you a single format, and ffmpeg will automatically perform any transformations necessary to achieve that (e.g. convert from packed to planar, RGB to YUV, 8-bit to 10-bit, 4:2:2 to 4:2:0, etc). However this can have a severe performance cost and can result in quality loss depending on the transforms required. The codec will typically specify its output format, largely dependent on the nature of the encoding, and then announce AVFrames that conform to that format. Since you're largely dealing with MPEG-2 and H.264 video, it's almost always going to be YUV 4:2:0 planar. The filter pipeline can then do conversion if needed, either because you told it to convert it or because you specified some filter pipeline where the individual filter didn't support what format it was being given. See libavutil/pixfmt.h for a list of all the possible formats in which AVFrames can be announced by a codec. Devin -- The U.S. political problem? Amateurs are doing the street fighting. The Princeps Senatus and the Tribunus Plebis need their own armies.
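For reference, the format normalization Devin describes is the job of libswscale; a minimal sketch of one such conversion (error handling mostly omitted; the wrapper function is illustrative, not an FFmpeg API):

    #include <stdint.h>
    #include <libswscale/swscale.h>

    /* Convert one decoded 4:2:0 planar frame to packed RGB24 with
     * libswscale. The data pointers and linesizes would come from the
     * source and destination AVFrames. */
    static void yuv420p_to_rgb24(const uint8_t *const src_data[],
                                 const int src_linesize[],
                                 uint8_t *const dst_data[],
                                 const int dst_linesize[],
                                 int w, int h)
    {
        struct SwsContext *ctx = sws_getContext(
            w, h, AV_PIX_FMT_YUV420P,   /* source geometry & format      */
            w, h, AV_PIX_FMT_RGB24,     /* destination geometry & format */
            SWS_BILINEAR, NULL, NULL, NULL);
        if (!ctx)
            return;
        sws_scale(ctx, src_data, src_linesize, 0, h, dst_data, dst_linesize);
        sws_freeContext(ctx);
    }

Setting up and tearing down the context, plus touching every sample, is where the "severe performance cost" of normalizing comes from.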
Re: [FFmpeg-user] should I shoot the dog?
On 09/29/2020 09:37 AM, Michael Koch wrote: Am 29.09.2020 um 14:58 schrieb Mark Filipak (ffmpeg): On 09/29/2020 04:06 AM, Michael Koch wrote: Am 29.09.2020 um 04:28 schrieb Mark Filipak (ffmpeg): I just want to understand the frame structures that ffmpeg creates, and that ffmpeg uses in processing and filtering. Are Y, Cb, Cr separate buffers? That would be logical. Or are the Y, Cb, Cr values combined and organized similarly to macroblocks? I've found some code that supports that. Or are the Y, Cb, Cr values thrown together, pixel-by-pixel. That would be logical, too. As far as I understood it, that depends on the pixel format. For example there are "packed" pixel formats rgb24, bgr24, argb, rgba, abgr, bgra,rgb48be, rgb48le, bgr48be, bgr48le. And there are "planar" pixel formats gbrp, bgrp16be, bgrp16le. Hi Michael, "Packed" and "planar", eh? What evidence do you have? ...Share the candy! As far as I know, this is not described in the official documentation. You can find it for example here: https://video.stackexchange.com/questions/16374/ffmpeg-pixel-format-definitions Thanks for that. It saved me some time. ...So, what does "planar" mean? What does "packed" mean? Now, I'm not talking about streams. I'm talking about after decoding. I'm talking about the buffers. I would think that a single, consistent format would be used. There is no single consistent format used internally. See Gyan's answer here: http://ffmpeg.org/pipermail/ffmpeg-user/2020-September/050031.html And thanks for that. So, ffmpeg really is a Tower of Babel, eh? That does not seem wise to me. That seems like a great way to generate bugs in addition to confusion. Now, I imagine that converting to a lingua franca would blow up processing time, so it isn't done. However, if there are format-to-format regular expressions for conversions, may I suggest that those regular expressions be published? Also, if Y Cb & Cr are separate buffers, may I suggest that ffmpeg publish that? In school, I learned that inputs and outputs should be fully characterized, not suggested, not implied, but fully characterized as structures. That takes time, and it takes review and perfection by informed people, but that time is worth the investment. -- The U.S. political problem? Amateurs are doing the street fighting. The Princeps Senatus and the Tribunus Plebis need their own armies. ___ ffmpeg-user mailing list ffmpeg-user@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-user To unsubscribe, visit link above, or email ffmpeg-user-requ...@ffmpeg.org with subject "unsubscribe".
Re: [FFmpeg-user] should I shoot the dog?
On 09/29/2020 04:06 AM, Michael Koch wrote: Am 29.09.2020 um 04:28 schrieb Mark Filipak (ffmpeg): I just want to understand the frame structures that ffmpeg creates, and that ffmpeg uses in processing and filtering. Are Y, Cb, Cr separate buffers? That would be logical. Or are the Y, Cb, Cr values combined and organized similarly to macroblocks? I've found some code that supports that. Or are the Y, Cb, Cr values thrown together, pixel-by-pixel. That would be logical, too. As far as I understood it, that depends on the pixel format. For example there are "packed" pixel formats rgb24, bgr24, argb, rgba, abgr, bgra, rgb48be, rgb48le, bgr48be, bgr48le. And there are "planar" pixel formats gbrp, bgrp16be, bgrp16le. Hi Michael, "Packed" and "planar", eh? What evidence do you have? ...Share the candy! Now, I'm not talking about streams. I'm talking about after decoding. I'm talking about the buffers. I would think that a single, consistent format would be used. So, why am I interested in ffmpeg's internal video buffer format? ...I've been here for about 1/2 year now, watching the ffmpeg slow-motion train wreck. It seems to me that the ffmpeg patricians assume that everyone knows the formats just because the patricians do, and have documented based on that assumption. Because we plebeians don't know the format, and we don't know that we don't know, the patricians get frustrated with us and become short tempered and then the word "troll" flies. I'm just a simple engineer. To understand an architecture, all I need is the structures, preferably as pictures, and maybe a bit of the processing flow, preferably via flow diagrams (i.e. step '1', then step '2', then step '3', etc.) -- I'm a visual kinda guy -- but I usually don't need to know the processing. Examining the source code doesn't work for me. 'C' code is just too cryptic and I'm too ignorant. -- The U.S. political problem? Amateurs are doing the street fighting. The Princeps Senatus and the Tribunus Plebis need their own armies. ___ ffmpeg-user mailing list ffmpeg-user@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-user To unsubscribe, visit link above, or email ffmpeg-user-requ...@ffmpeg.org with subject "unsubscribe".
[FFmpeg-user] should I shoot the dog?
I've spent 2 days studying frame.h, pixfmt.h, dpx.c, packet.h, and escape124.c. I haven't learned a damn thing. Despite their vagueness and ambiguity, reading and understanding H.222 & H.262 are dead easy by comparison [1]. I just want to understand the frame structures that ffmpeg creates, and that ffmpeg uses in processing and filtering. Are Y, Cb, Cr separate buffers? That would be logical. Or are the Y, Cb, Cr values combined and organized similarly to macroblocks? I've found some code that supports that. Or are the Y, Cb, Cr values thrown together, pixel-by-pixel. That would be logical, too. I really can't understand how anyone can architect these things without making some pictures. Can anyone here help me, or should I shoot the dog? Notes: [1] Reading and understanding H.222 & H.262 is slightly easier than self-administered appendectomy. -- The U.S. political problem? Amateurs are doing the street fighting. The Princeps Senatus and the Tribunus Plebis need their own armies. ___ ffmpeg-user mailing list ffmpeg-user@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-user To unsubscribe, visit link above, or email ffmpeg-user-requ...@ffmpeg.org with subject "unsubscribe".
Re: [FFmpeg-user] AVFrame, AV_NUM_DATA_POINTERS
On 09/28/2020 03:49 PM, James Darnley wrote: On 28/09/2020, Mark Filipak (ffmpeg) wrote: On 09/27/2020 03:31 PM, James Darnley wrote: On 27/09/2020, Mark Filipak (ffmpeg) wrote: 2, Are the width & height indexes in bytes or samples? If bytes, how are 8-bit v. 10-bit v. 12-bit pixel formats handled at the index generation end? Width and height are given in pixels. How that relates to bytes in memory depends on the pixel format. Different planes can have different sizes like in the extremely common yuv420p. Ah-ha #1. I think you've answered another question. The planes that ffmpeg refers to are the Y, Cb, and Cr samples, is that correct? If the pixel format is a YCbCr format, such as yuv420p, then yes. If it matters to you: I am not sure of the exact order of the planes. It is probably documented in the pixel format header. Yes, I'm familiar with pixfmt.h. I find this surprising. But then, ffmpeg is full of surprises, eh? I anticipated there would be a single ffmpeg video processing/pipeline format that decoders would provision. Many differing pixel formats seem a point of complexity that promotes error. Regarding the order of the planes, I suspect there is none. I've not examined the source code, but I suspect that 3 unique buffer pointers are supplied to the decoder. Also surprising is that the word "plane" is apparently used for both video and audio. RGB is also available and maybe some other more niche ones. Oh, alpha channels too. Again see the pixel format. So, I'm going to make some statements that can be confirmed or refuted -- making statements rather than asking questions is just part of my training, not arrogance. Statements are usually clearer. I'm trying to nail down the structures for integration into my glossary. For YCbCr420, 8-bit, 720x576 (for example), the planes are separate and the structures are: Y: 2-dimensional, 720x576 byte array. Cb: 2-dimensional, 180x144 byte array. Cr: 2-dimensional, 180x144 byte array. What do you mean by 2 dimensional? Width x Height. IMO you should think of the planes as a single block of memory each. The first pixel will be the first byte. In your example the first plane in a yuv420p picture will be at least 720*576 bytes long. The two chroma planes will have 360x288 samples each with their own linesize. I'm not sure how you got 180x144. The subsampling is only a factor of 2 for 4:2:0. I don't know what you mean. In 4:2:0 format, there are 1 each of Cb & Cr for every 4 Y. 180x144 = (720/2)x(576/2). ...Argh! Wrong! ...Duh? Of course I should have written 360x288 -- my bad. 8-] ...brain fart! (How embarrassing.) The linesize can make it larger than that. The linesize also says how many bytes are between the start of a row and the start of the next. The same color space and subsampling could be expressed in a few different ways. Again it is the pixel format which says how the data is laid out in memory. You will probably have yuv420p. Specifically, the decoder's output is not in macroblock format, correct? The reason I ask for confirmation is that H.262 implies that even raw pictures are in macroblock format, improbable as that may seem. An AVFrame might not come from a source that has macroblocks. I have no idea what H.262 says. Okay, some architecture, okay? I'm interested in how ffmpeg programmatically represents frames during processing. (Frames are represented as (W/16)*(H/16) number of macroblocks in MPEG-PSs.) Byte order (endianness) of larger samples depends on the pixel format (but it is usually native).
The number of bytes used for a sample is given in the pixel format. The bits are in the low N bits. Ah-ha #2. I think you've answered yet another question: The arrays are bytes, not bits, correct? So, going from 8-bit samples to 10-bit samples doubles the sizes of the arrays, correct? You cannot easily address bits in C and ffmpeg doesn't bother with bit fields. yuv420p10 will use 16-bit words with the samples in the low 10 bits and the high 6 are zero. This does have the effect of doubling the size of the memory buffers. P.S. When I say pixel format I mean the specific ffmpeg feature. Understood. Thanks again, James. I'm going to assume that Y, Cb, and Cr are buffered separately, i.e. that there's no frame struct per se. I think that wraps it up vis-a-vis ffmpeg internal representation of video. -- The U.S. political problem? Amateurs are doing the street fighting. The Princeps Senatus and the Tribunus Plebis need their own armies. ___ ffmpeg-user mailing list ffmpeg-user@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-user To unsubscribe, visit link above, or email ffmpeg-user-requ...@ffmpeg.org with subject "unsubscribe".
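Putting James's description into code, here is a sketch of addressing single samples through an AVFrame's data/linesize pairs (the helper names are illustrative, not an FFmpeg API):

    #include <stdint.h>
    #include <libavutil/frame.h>

    /* For yuv420p, data[0]/data[1]/data[2] are the Y/Cb/Cr planes and
     * linesize[i] is the byte stride of plane i (it may exceed the
     * visible width, as James notes). */
    static uint8_t y_sample(const AVFrame *f, int x, int y)
    {
        return f->data[0][y * f->linesize[0] + x];
    }

    /* For yuv420p10, each sample is a 16-bit word holding 10 significant
     * bits, so the same plane is indexed as uint16_t -- which is why the
     * buffers double in size going from 8-bit to 10-bit. */
    static uint16_t y_sample_10bit(const AVFrame *f, int x, int y)
    {
        const uint16_t *row =
            (const uint16_t *)(f->data[0] + y * f->linesize[0]);
        return row[x];
    }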
[FFmpeg-user] frame.h error?
In frame.h, I think that this line: 352 * but for planar audio with more channels that can fit in data, should be: 352 * but for planar audio with more channels than can fit in data, How do I confirm it and make the correction? Thanks! -- The U.S. political problem? Amateurs are doing the street fighting. The Princeps Senatus and the Tribunus Plebis need their own armies. ___ ffmpeg-user mailing list ffmpeg-user@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-user To unsubscribe, visit link above, or email ffmpeg-user-requ...@ffmpeg.org with subject "unsubscribe".
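For what it's worth, a fix like this is normally confirmed in a source checkout and then submitted as a patch to the ffmpeg-devel list; roughly along these lines (a sketch of the usual git workflow, not official project instructions -- the commit message is illustrative):

    $ git grep -n "more channels that can fit" libavutil/frame.h
      (edit the comment, then)
    $ git commit -a -m "avutil/frame: fix typo in extended_data comment"
    $ git format-patch -1

The resulting patch file is then mailed to ffmpeg-devel, e.g. with git send-email.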
Re: [FFmpeg-user] AVFrame, AV_NUM_DATA_POINTERS
On 09/27/2020 03:31 PM, James Darnley wrote: On 27/09/2020, Mark Filipak (ffmpeg) wrote: From https://www.ffmpeg.org/doxygen/trunk/frame_8h_source.html#l00309 typedef struct AVFrame { #define AV_NUM_DATA_POINTERS 8 /** * pointer to the picture/channel planes. * This might be different from the first allocated byte * * Some decoders access areas outside 0,0 - width,height... 1, Are samples & lines actually indexed from zero? I ask because, if so, then shouldn't the extents be 0,0 - (width-1),(height-1)? -- the discrepancy makes me unsure how to interpret what I read. Yes, from 0. Everything in C is indexed from 0 because they are pointer offsets. Maybe that document should say what you suggest. Thanks, James. 2, Are the width & height indexes in bytes or samples? If bytes, how are 8-bit v. 10-bit v. 12-bit pixel formats handled at the index generation end? Width and height are given in pixels. How that relates to bytes in memory depends on the pixel format. Different planes can have different sizes like in the extremely common yuv420p. Ah-ha #1. I think you've answered another question. The planes that ffmpeg refers to are the Y, Cb, and Cr samples, is that correct? So, I'm going to make some statements that can be confirmed or refuted -- making statements rather than asking questions is just part of my training, not arrogance. Statements are usually clearer. I'm trying to nail down the structures for integration into my glossary. For YCbCr420, 8-bit, 720x576 (for example), the planes are separate and the structures are: Y: 2-dimensional, 720x576 byte array. Cb: 2-dimensional, 180x144 byte array. Cr: 2-dimensional, 180x144 byte array. Specifically, the decoder's output is not in macroblock format, correct? The reason I ask for confirmation is that H.262 implies that even raw pictures are in macroblock format, improbable as that may seem. Byte order (endianness) of larger samples depends on the pixel format (but it is usually native). The number of bytes used for a sample is given in the pixel format. The bits are in the low N bits. Ah-ha #2. I think you've answered yet another question: The arrays are bytes, not bits, correct? So, going from 8-bit samples to 10-bit samples doubles the sizes of the arrays, correct? Thanks so much, James. Warm regards, Mark. -- The U.S. political problem? Amateurs are doing the street fighting. The Princeps Senatus and the Tribunus Plebis need their own armies. ___ ffmpeg-user mailing list ffmpeg-user@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-user To unsubscribe, visit link above, or email ffmpeg-user-requ...@ffmpeg.org with subject "unsubscribe".
[FFmpeg-user] AVFrame, AV_NUM_DATA_POINTERS
From https://www.ffmpeg.org/doxygen/trunk/frame_8h_source.html#l00309 typedef struct AVFrame { #define AV_NUM_DATA_POINTERS 8 /** * pointer to the picture/channel planes. * This might be different from the first allocated byte * * Some decoders access areas outside 0,0 - width,height... 1, Are samples & lines actually indexed from zero? I ask because, if so, then shouldn't the extents be 0,0 - (width-1),(height-1)? -- the discrepancy makes me unsure how to interpret what I read. 2, Are the width & height indexes in bytes or samples? If bytes, how are 8-bit v. 10-bit v. 12-bit pixel formats handled at the index generation end? Thanks! -- The U.S. political problem? Amateurs are doing the street fighting. The Princeps Senatus and the Tribunus Plebis need their own armies. ___ ffmpeg-user mailing list ffmpeg-user@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-user To unsubscribe, visit link above, or email ffmpeg-user-requ...@ffmpeg.org with subject "unsubscribe".
Re: [FFmpeg-user] MPEG-PS
On 09/26/2020 08:42 AM, Peter wrote: Hi, I have a file MPEG-PS, AVC264 with audio code G711u. It does contain absolute timestamp in each frame i.e I can say when exactly each second of this video was shot. I want to do the following: I'm still exploring/learning the structures in program streams, but I'll give you my current notions and perhaps someone will confirm/refute what I write and we'll both learn, eh? 1. Extract absolute timestamps from the video Absolute timestamps don't exist in an MPEG-PS. The best that can be determined is frame order and frame rate from which absolute timestamps can be calculated. 2. Convert the file to MP4 h264 with some more web friendlier audio codec i.e aac. You want to remux video and transcode audio. 3. Play the files on web using Nginx and Video.js and be able to search by absolute timestamp. I have no knowledge of these. I am aware that most likely I will need to store absolute timestamp in separate files as I cannot put them in mp4. Any help/feedback is welcome :-) Thanks, Peter -- Ignorance can be cured. The cure is called "learning". Arrogance can be cured. The cure is called "humbling". Curing arrogant ignorance requires more. It requires a personality transplant and is best accomplished after the subject has been removed from public office. ___ ffmpeg-user mailing list ffmpeg-user@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-user To unsubscribe, visit link above, or email ffmpeg-user-requ...@ffmpeg.org with subject "unsubscribe".
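A sketch of step 2 as a single command (file names are placeholders; -c:v copy remuxes the H.264 video untouched while only the audio is transcoded to AAC):

    $ ffmpeg -i input.mpg -map 0:v -map 0:a -c:v copy -c:a aac output.mp4

Because the video bitstream is copied rather than re-encoded, there is no generation loss and the operation is fast.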
Re: [FFmpeg-user] chroma colour
On 09/24/2020 09:21 AM, Paul Bourke wrote: Nice idea, what upscaling methods you use? Do you use swscale by any chance? Not sure I understand. I don't upscale the movie/image, rather my code that creates the remap filters just creates the maps 2,3,4 times bigger than I eventually plan to use. Hi, Paul. Can you tell me something about "the maps"? Are you referring to an ordinary array of cooked picture samples or do you mean something special that I'm unaware of? Thanks! And what filters for upscaling and later downscaling (bilinear/lanczos/spline/...) ? Generally just bilinear. -- Ignorance can be cured. The cure is called "learning". Arrogance can be cured. The cure is called "humbling". Curing arrogant ignorance requires more. It requires a personality transplant and is best accomplished after the subject has been removed from public office. ___ ffmpeg-user mailing list ffmpeg-user@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-user To unsubscribe, visit link above, or email ffmpeg-user-requ...@ffmpeg.org with subject "unsubscribe".
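As I understand the workflow being described, "the maps" are the two extra video inputs that the remap filter consumes (per-pixel x and y coordinate maps, typically 16-bit PGM files); a sketch of applying oversized maps and then scaling the result back down (file names and sizes are illustrative, and depending on the build the single-image map inputs may need to be looped):

    $ ffmpeg -i movie.mp4 -i xmap.pgm -i ymap.pgm \
          -lavfi "[0:v][1:v][2:v]remap,scale=1920:1080:flags=bilinear" out.mp4

remap's output takes the maps' dimensions, so building the maps 2-4x oversize and scaling down afterwards (bilinear, as Paul mentions) amounts to supersampled remapping.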
Re: [FFmpeg-user] ffplay struggling with this station!
On 09/24/2020 12:52 AM, Edward Park wrote: Hi, I just realized that the station was public, so I just tried: % ffplay -nodisp -vn "https://jrtv-live.ercdn.net/jrradio/englishradiovideo.m3u8” no issues. I did try without that buffer flag, but that had no effect. I’m going to try updating ffplay, and see if that helps. Yeah also update the tls library before building and if that still doesn’t fix it it might be the connection speed? Also, I’ve noticed that vlc had a 1000 ms “network cache”, and I wonder if that had anything to do with playing that station flawlessly. That probably means something like it’s’ playing 1 second in the past, so if something happens and it can’t keep up in realtime there’s still 1 second to fix it before it skips. That's exactly what it means, though I think the word "buffer" would be more appropriate. cache [noun]: a temporary storage space or memory that allows fast access to data. buffer [noun]: a storage device for temporarily holding data until the computer is ready to receive or process the data, as when a receiving unit has an operating speed lower than that of the unit feeding data to it. ___ ffmpeg-user mailing list ffmpeg-user@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-user To unsubscribe, visit link above, or email ffmpeg-user-requ...@ffmpeg.org with subject "unsubscribe".
Re: [FFmpeg-user] bwdif filter question
On 09/23/2020 05:27 PM, Paul B Mahol wrote: On Wed, Sep 23, 2020 at 04:26:27PM -0400, Mark Filipak (ffmpeg) wrote: On 09/23/2020 03:53 PM, Carl Eugen Hoyos wrote: Am Di., 22. Sept. 2020 um 00:47 Uhr schrieb Mark Filipak (ffmpeg) : On 09/21/2020 06:07 PM, Carl Eugen Hoyos wrote: Am Mo., 21. Sept. 2020 um 14:16 Uhr schrieb Mark Filipak (ffmpeg) : Here is what you wrote: "The following makes little sense, it is just meant as an example: $ ffmpeg -f lavfi -i testsrc2,field -vf bwdif -f null -" That "explains" nothing. Worse, it seems crass and sarcastic. No. This was an example to show you how you can feed one field to a filter in our system, this is what you had asked for ... I didn't ask for that. This is not true: How can a frame contain just one field? I did not ask for an example to see "how you can feed one field to a filter". I asked how a frame can contain just one field. You have yet to answer that. I think it's impossible. You may be referring to a frame that is deinterlaced and cut in half (e.g. from 720x576 to 720x288), in which case the frame contains no field. You wrote: "(If you provide only one field, no FFmpeg deinterlacer will produce useful output.)". Of course I agree with the "no...useful output" part, however, how can a person "provide only one field"? That implies that "provide only one field" is an option. I think that's impossible, so I asked you how it was possible. I did not ask how to implement that impossibility on the command line (which I think is likewise impossible). It is along these lines that misunderstanding and confusion and novice angst ensues. Am I nitpicking? I think not. You are an authority. When an authority uses loose language, misunderstanding and confusion and angst must follow. But MPEG and ffmpeg seems to be primed to require loose language. That needs to end. Try to read and follow separatefields, weave and doubleweave filters documentation. Thank you, Paul. I do try to read them. Is there something specific to which you can point? All inputs are accepted and appreciated. I'm sure we both endeavor to make ffmpeg better. ___ ffmpeg-user mailing list ffmpeg-user@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-user To unsubscribe, visit link above, or email ffmpeg-user-requ...@ffmpeg.org with subject "unsubscribe".
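For readers following the thread, the filters Paul names do expose single fields as frames; a sketch of a round trip (separatefields halves the height and doubles the frame rate, weave recombines; assuming genuinely interlaced input, the output frames should match the input):

    $ ffmpeg -i interlaced.mkv -vf separatefields,weave=first_field=top -c:a copy out.mkv

Between the two filters, each half-height frame carries exactly one field, which is the closest FFmpeg comes to "providing only one field" to a downstream filter.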
Re: [FFmpeg-user] bwdif filter question
On 09/23/2020 03:53 PM, Carl Eugen Hoyos wrote: Am Di., 22. Sept. 2020 um 00:47 Uhr schrieb Mark Filipak (ffmpeg) : On 09/21/2020 06:07 PM, Carl Eugen Hoyos wrote: Am Mo., 21. Sept. 2020 um 14:16 Uhr schrieb Mark Filipak (ffmpeg) : Here is what you wrote: "The following makes little sense, it is just meant as an example: $ ffmpeg -f lavfi -i testsrc2,field -vf bwdif -f null -" That "explains" nothing. Worse, it seems crass and sarcastic. No. This was an example to show you how you can feed one field to a filter in our system, this is what you had asked for ... I didn't ask for that. This is not true: How can a frame contain just one field? I did not ask for an example to see "how you can feed one field to a filter". I asked how a frame can contain just one field. You have yet to answer that. I think it's impossible. You may be referring to a frame that is deinterlaced and cut in half (e.g. from 720x576 to 720x288), in which case the frame contains no field. You wrote: "(If you provide only one field, no FFmpeg deinterlacer will produce useful output.)". Of course I agree with the "no...useful output" part, however, how can a person "provide only one field"? That implies that "provide only one field" is an option. I think that's impossible, so I asked you how it was possible. I did not ask how to implement that impossibility on the command line (which I think is likewise impossible). It is along these lines that misunderstanding and confusion and novice angst ensues. Am I nitpicking? I think not. You are an authority. When an authority uses loose language, misunderstanding and confusion and angst must follow. But MPEG and ffmpeg seems to be primed to require loose language. That needs to end. ___ ffmpeg-user mailing list ffmpeg-user@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-user To unsubscribe, visit link above, or email ffmpeg-user-requ...@ffmpeg.org with subject "unsubscribe".
Re: [FFmpeg-user] bwdif filter question
On 09/23/2020 12:19 PM, Greg Oliver wrote: On Tue, Sep 22, 2020 at 1:14 PM Mark Filipak (ffmpeg) wrote: -snip- He has repeatedly posted to either understand or define better the internals of ffmpeg itself... Thanks for the kind words. Yaknow, I'm not special or a wizard. I suffer the same assumptions as everyone. As I work on my glossary, I'm amazed when I realize something that I had wrong, but had worked on steadily for weeks without actually seeing. Let me give you an example. Last night I realized no matter whether a stream is frame or TFF (top_field_first) or BFF (bottom_field_first), that macroblock samples have exactly the same order; that it's the order that these samples are read out by the decoder that determines whether the 1st sample goes to line 1 or line 2, and whether the 4 luminance blocks are concurrent (aka "progressive"). In other words, TFF and BFF are not formats. They are access methods!! That realization caused me to dump a raft of seemingly clever, seemingly insightful diagrams that had taken weeks of revisions to hone. I realized that those diagrams were crap and just reinforced concepts that seem reasonable and are universally accepted but that can't survive close scrutiny. That kind of insight (which makes me think I'm stupid for not seeing it immediately) will be in the glossary. The existing stuff not only implies that fields exist -- fields do not exist (no such structure, at least not in streams) and it took me a month of learning how to parse macroblocks to discover that -- the existing stuff implies that TFF and BFF are differing formats, but they're not formats at all! I contend that ordinary users can understand the differences between (hard) structure and (soft) description, and between a format and a method. I think ordinary users are so hungry to get real information that they're willing beg and plead and (nearly) debase themselves. ___ ffmpeg-user mailing list ffmpeg-user@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-user To unsubscribe, visit link above, or email ffmpeg-user-requ...@ffmpeg.org with subject "unsubscribe".
Re: [FFmpeg-user] bwdif filter question
On 09/22/2020 04:20 AM, Edward Park wrote: Not so, Ted. The following two definitions are from the glossary I'm preparing (and which cites H.262). Ah okay I thought that was a bit weird, I assume it was a typo but I saw h.242 and thought two different types of "frames" were being mixed. Before saying anything if the side project you mentioned was a layman’s glossary type reference material, I think you should base it off of the definitions section instead of the bitstream definitions, just my $.02. H.242 was indeed a typo ...Oh, wait! Doesn't (H.222+H.262)/2 = H.242?. :-) I'm not sure what you mean by "definitions section", but I don't believe in "layman's" glossaries. I believe novices can comprehend structures at a codesmith's level if the structures are precisely represented. The novices who can't comprehend the structures need to learn. If they don't want to learn, then they're not really serious. This video processing stuff is for serious people. That written, what is not reasonable, IMHO, is to expect novices to learn codesmith-jargon and codesmith-shorthand. English has been around for a long time and it includes everything that is needed. I would show you some of my mpegps parser documentation and some of my glossary stuff, but 90% of it is texipix diagrams and/or spreadsheet-style, plaintext tables that are formatted way wider than 70 characters/line, so won't paste into email. -snip- Since you capitalize "AVFrames", I assume that you cite a standard of some sort. I'd very much like to see it. Do you have a link? This was the main info I was trying to add, it's not a standard of any kind, quite the opposite, actually, since technically its declaration could be changed in a single commit, but I don't think that is a common occurrence. AVFrame is a struct that is used to abstract/implement all frames in the many different formats ffmpeg handles. it is noted that its size could change as fields are added to the struct. There's documentation generated for it here: https://www.ffmpeg.org/doxygen/trunk/structAVFrame.html Oh, Thank You! That's going to help me to communicate/discuss with the developers. H.262 refers to "frame pictures" and "field pictures" without clearly delineating them. I am calling them "pictures" and "halfpictures". I thought ISO 13818-2 was basically the identical standard, and it gives pretty clear definitions imo, here are some excerpts. (Wall of text coming up… standards are very wordy by necessity) --snip 13818-2 excerpts-- To me, that's all descriptions, not definitions. To me, it's vague and ambiguous. To me, it's sort of nebulous. Standards don't need to be wordy. The *more* one writes, the greater the chance of mistakes and ambiguity. Write less, not more. Novices aren't dumb, they're just ignorant. By your use of "struct" in your reply, I take it that you're a 'C' codesmith -- I write assy & other HLL & hardware description languages like VHDL & Verilog, but I've never written 'C'. I've employed 'C' codesmiths, therefore, I'm a bit conversant with 'C', but just a bit. What I've noted is that codesmiths generally don't know how to write effective English. Writing well constructed English is difficult and time consuming at first, as difficult as learning how to effectively use any syntax that requires knowledge and experience. There are clear rules but most codesmiths don't know them, especially if English is their native language. They write like they speak: conversationally. 
And when others don't understand what's written, rather than revise smaller, the grammar-challenged revise longer, thinking that yet-another-perspective is what's needed. That produces ambiguity because different perspectives promote ambiguity. IMHO, there should be just one perspective: structure. Usage is the place for description but that's not (or shouldn't be) in the domain of a glossary. So field pictures are decoded fields, and frame pictures are decoded frames? Not sure if I understand 100% but I think it's pretty clear, "two field pictures comprise a coded frame." IIRC field pictures aren't decoded into separate fields because two frames in one packet makes something explode within FFmpeg. Well, packets are just how transports chop up a stream in order to send it, piecewise, via a packetized medium. They don't matter. I think that, for mpegps, start at 'sequence_header_code' (i.e. x00 00 01 B3) and proceed from there, through the transport packet headers, throwing out the packet headers, until encountering the next 'sequence_header_code' or the 'sequence_end_code' (i.e. x00 00 01 B7). I don't know how frames are passed from the decoder to a filter inside ffmpeg. I don't know whether the frames are in the form of decoded samples in a macroblock structure or are just a glob of bytes. Considering the differences between 420 & 422 & 444, I think that the frames passed from the decoder must have some structure.
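Mark's parsing plan boils down to scanning for MPEG start codes; a minimal C sketch (buffering across reads omitted):

    #include <stddef.h>
    #include <stdint.h>

    /* Scan a buffer for an MPEG start code (00 00 01 xx) with the given
     * code byte, e.g. 0xB3 = sequence_header_code and
     * 0xB7 = sequence_end_code per H.262. Returns the offset of the
     * first 00 of the code, or -1 if not found. */
    static long find_start_code(const uint8_t *buf, size_t len, uint8_t code)
    {
        for (size_t i = 0; i + 4 <= len; i++)
            if (buf[i] == 0x00 && buf[i + 1] == 0x00 &&
                buf[i + 2] == 0x01 && buf[i + 3] == code)
                return (long)i;
        return -1;
    }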
Re: [FFmpeg-user] bwdif filter question
On 09/22/2020 05:59 AM, Nicolas George wrote: Mark Filipak (ffmpeg) (12020-09-21): Not so, Ted. The following two definitions are from the glossary I'm preparing (and which cites H.262). Quoting yourself does not prove you right. You are correct. That's why H.262 is in the definition. I'm not quoting myself. I'm quoting H.262. ___ ffmpeg-user mailing list ffmpeg-user@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-user To unsubscribe, visit link above, or email ffmpeg-user-requ...@ffmpeg.org with subject "unsubscribe".
Re: [FFmpeg-user] bwdif filter question
On 09/21/2020 06:54 PM, Bouke wrote: On 22 Sep 2020, at 00:44, Mark Filipak (ffmpeg) wrote: Paul Mahol accused me He was not the only one. Go away! and no, this is not aimed at you, but to the rest of the bunch to NOT FEED THE TROLL You calling me a troll doesn't make it so. Anyone following this thread knows from which direction the insults come. ___ ffmpeg-user mailing list ffmpeg-user@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-user To unsubscribe, visit link above, or email ffmpeg-user-requ...@ffmpeg.org with subject "unsubscribe".
Re: [FFmpeg-user] bwdif filter question
On 09/21/2020 06:07 PM, Carl Eugen Hoyos wrote: Am Mo., 21. Sept. 2020 um 14:16 Uhr schrieb Mark Filipak (ffmpeg) : On 09/21/2020 03:33 AM, Carl Eugen Hoyos wrote: Am 21.09.2020 um 01:56 schrieb Mark Filipak (ffmpeg) : How can it 'deinterlace' a single field? It can’t and that is what I explained several times in my last two mails. Here is what you wrote: "The following makes little sense, it is just meant as an example: $ ffmpeg -f lavfi -i testsrc2,field -vf bwdif -f null -" That "explains" nothing. Worse, it seems crass and sarcastic. No. This was an example to show you how you can feed one field to a filter in our system, this is what you had asked for ... I didn't ask for that. That was in your reply to a comment from Mark Himsley. "No matter if the raster contains one field, two interlaced fields or a progressive frame, the filter will always see an input frame." I simply asked how a deinterlacing filter would handle an input that has only one field. It's a question that, I note, you have not answered except that it "makes little sense", to which I agreed. ... I used the filter that is the topic in this mailing list thread. In addition, I explained - not only but including above - that this is not a useful example for an interlace filter, just as feeding a progressive frame is not useful. I agree in both cases of course. Please understand that I have shown significantly more patience with you than with most other users here and significantly more patience than most people on this mailing list (including the silent ones) have with you. I can only ask you to accept the answers you receive instead of interpreting every single one of them as a personal attack just because they don't match what you expect. Paul Mahol accused me of attacking you. That's absurd of course. Now you accuse me of feeling attacked. How would you know what I feel? I don't feel attacked. You and Paul need to get your stories aligned. :-) ___ ffmpeg-user mailing list ffmpeg-user@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-user To unsubscribe, visit link above, or email ffmpeg-user-requ...@ffmpeg.org with subject "unsubscribe".
Re: [FFmpeg-user] bwdif filter question
On 09/21/2020 11:24 AM, Edward Park wrote: Morning, Hi Ted! Regarding 'progressive_frame', ffmpeg has 'interlaced_frame' in lieu of 'progressive_frame'. I think that 'interlaced_frame' = !'progressive_frame' but I'm not sure. Confirming it as a fact is a side project that I work on only occasionally. H.242 defines "interlace" as solely the condition of PAL & NTSC scan-fields (i.e. field period == (1/2)(1/FPS)), but I don't want to pursue that further because I don't want to be perceived as a troll. :-) I'm not entirely aware of what is being discussed, but progressive_frame = !interlaced_frame kind of sent me back a bit, I do remember the discrepancy you noted in some telecined material, so I'll just quickly paraphrase from what we looked into before, hopefully it'll be relevant. The AVFrame interlaced_frame flag isn't completely unrelated to mpeg progressive_frame, but it's not a simple inverse either, very context-dependent. With mpeg video, it seems it is an interlaced_frame if it is not progressive_frame ... Not so, Ted. The following two definitions are from the glossary I'm preparing (and which cites H.262). 'progressive_frame' [noun]: 1, A metadata bit differentiating a picture or halfpicture frame ('1') from a scan frame ('0'). 2, H.262 §6.3.10: "If progressive_frame is set to 0 it indicates that the two fields of the frame are interlaced fields in which an interval of time of the field period exists between (corresponding spatial samples) of the two fields. ... If progressive_frame is set to 1 it indicates that the two fields (of the frame) are actually from the same time instant as one another." interlace [noun]: 1, H.262 §3.74: "The property of conventional television frames [1] where alternating lines of the frame represent different instances in time." [1] H.262 clearly limits interlace to scan-fields and excludes concurrent fields (and also the non-concurrent fields that can result from hard telecine). 2, Informal: The condition in which the samples of odd and even rows (or lines) alternate. [verb], informal: To weave or reweave fields. -- A note about my glossary: "picture frame", "halfpicture frame", and "scan frame" are precisely and unambiguously defined by (and differentiated from one another by) their physical structures (including any metadata that may demarcate them), not by their association to other features and not by the context in which they appear. I endeavor to make all definitions strong in likewise manner. ... and it shouldn't result where mpeg progressive_sequence is set. Basically, the best you can generalize from that is the frame stores interlaced video. (Yes interlaced_frame means the frame has interlaced material) Doesn't help at all... But I don't think it can be helped? Since AVFrames accommodate many more types of video frame data than just the generations of mpeg coded. Since you capitalize "AVFrames", I assume that you cite a standard of some sort. I'd very much like to see it. Do you have a link? I think it was often said (not as much anymore) that "FFmpeg doesn't output fields" and I think at least part of the reason is this. At the visually essential level, there is the "picture" described as a single instance of a sequence of frames/fields/lines or what have you depending on the format and technology; the image that you actually see. H.262 refers to "frame pictures" and "field pictures" without clearly delineating them. I am calling them "pictures" and "halfpictures".
But that's a visual projection of the decoded and rendered video, or if you're encoding, it's what you want to see when you decode and render your encoding. I think the term itself has a very abstract(?) nuance. The picture seen at a certain presentation timestamp either has been decoded, or can be encoded as frame pictures or field pictures. You see, you are using the H.262 nomenclature. That's fine, and I'm considering using it also even though it appears to be excessively wordy. Basically, I prefer "pictures" for interlaced content and "halfpictures" for deinterlaced content unweaved from a picture. Both are stored in "frames", a red herring in the terminology imo ... Actually, it is frames that exist. Fields don't exist as discrete, unitary structures in macroblocks in streams. ... The AVFrame that ffmpeg deals with isn't necessarily a "frame" as in a rectangular picture frame with width and height, but closer to how the data is temporally "framed," e.g. in packets with header data, where one AVFrame has one video frame (picture). Image data could be scanned by macroblock, unless you are playing actual videotape. You're singing a sweet song, Ted. Frames actually do exist in streams and are denoted by metadata. The data inside the macroblocks inside slices I am calling framesets. I firmly believe that every structure should have a unique name. So when interlace scanned fields are stored in
Re: [FFmpeg-user] bwdif filter question
On 09/21/2020 09:26 AM, Paul B Mahol wrote: On Mon, Sep 21, 2020 at 08:11:59AM -0400, Mark Filipak (ffmpeg) wrote: On 09/21/2020 03:33 AM, Carl Eugen Hoyos wrote: On 21.09.2020 at 01:56, Mark Filipak (ffmpeg) wrote: How can it 'deinterlace' a single field? It can’t and that is what I explained several times in my last two mails. Here is what you wrote: "The following makes little sense, it is just meant as an example: $ ffmpeg -f lavfi -i testsrc2,field -vf bwdif -f null -" That "explains" nothing. Worse, it seems crass and sarcastic. The perfect word is "snarky". Do you know that word? It's a word invented by the man who wrote "Alice In Wonderland". Sometimes it seems that what you write is meant to pull people down a psychedelic rabbit hole and into a fantasy world. Just because something is possible with ffmpeg, if it doesn't make sense to do it, don't mention it. If you do mention it and you write that it makes "little sense", then explain why it makes little sense. In this case, it doesn't make "little sense". It makes *no* sense. Please refrain from attacking other people on this mailing list. I am not attacking Carl Eugen. I'm trying to help him. ___ ffmpeg-user mailing list ffmpeg-user@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-user To unsubscribe, visit link above, or email ffmpeg-user-requ...@ffmpeg.org with subject "unsubscribe".
Re: [FFmpeg-user] bwdif filter question
On 09/21/2020 03:33 AM, Carl Eugen Hoyos wrote: On 21.09.2020 at 01:56, Mark Filipak (ffmpeg) wrote: How can it 'deinterlace' a single field? It can’t and that is what I explained several times in my last two mails. Here is what you wrote: "The following makes little sense, it is just meant as an example: $ ffmpeg -f lavfi -i testsrc2,field -vf bwdif -f null -" That "explains" nothing. Worse, it seems crass and sarcastic. The perfect word is "snarky". Do you know that word? It's a word invented by the man who wrote "Alice In Wonderland". Sometimes it seems that what you write is meant to pull people down a psychedelic rabbit hole and into a fantasy world. Just because something is possible with ffmpeg, if it doesn't make sense to do it, don't mention it. If you do mention it and you write that it makes "little sense", then explain why it makes little sense. In this case, it doesn't make "little sense". It makes *no* sense. ___ ffmpeg-user mailing list ffmpeg-user@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-user To unsubscribe, visit link above, or email ffmpeg-user-requ...@ffmpeg.org with subject "unsubscribe".
Re: [FFmpeg-user] bwdif filter question
On 09/20/2020 05:44 PM, Carl Eugen Hoyos wrote: On Sun, 20 Sep 2020 at 06:59, Mark Filipak (ffmpeg) wrote: On 09/18/2020 03:01 PM, Carl Eugen Hoyos wrote: On 16.09.2020 at 15:58, Mark Himsley wrote: On Mon, 14 Sep 2020 at 15:42, Mark Filipak (ffmpeg) wrote: Is the input to the bwdif filter fields or frames? The input to every filter in a filter chain is a raster of pixels. That raster may contain one frame or two fields. That may not be wrong (apart from Paul’s comment) but I wonder how useful it is: No matter if the raster contains one field, two interlaced fields or a progressive frame, the filter will always see an input frame. "...if the raster contains *one field*...the filter will always see an input *frame*." How is that possible? How can a frame contain just one field? The following makes little sense, it is just meant as an example: $ ffmpeg -f lavfi -i testsrc2,field -vf bwdif -f null - Here, the input to bwdif consists of frames that contain one field (of the original input). Thanks, Carl Eugen. Kindly forgive my ignorance -- I can't read 'C' code and probably couldn't find the relevant code section if my life depended on it. If bwdif is the *only* filter, then, from previous discussions, I understand that its input (i.e. the decoder's output) is raw frames (e.g. 720x576)? If raw frames, then I can understand the above to mean that the filter is 'fed' only one field (e.g. 720x288). Logically, to me, that would be a frame (i.e. a 720x288 frame), but no matter (let's forget that). However, even then, the filter is receiving only one field. How can it 'deinterlace' a single field? I'm mystified. Does it line double in such a circumstance? Or does it deinterlace the current single field with the next single field one frame later? The fact that there is metadata that may signal the content is also not necessarily helpful as this metadata is typically wrong (often signalling fields when a frame is provided). Can you provide an example (or a link to an example)? I've examined a great number of DSM mpeg presentation streams ('VOB's & 'm2ts's) and I've not seen a single case. What metadata are you looking at? sequence_extension: 'progressive_sequence'? picture_coding_extension: 'picture_structure'? picture_coding_extension: 'top_field_first'? picture_coding_extension: 'repeat_first_field'? I would expect that most commercial encodings you have use one of the above, independently of the content... Based on my experience, and to the best of my knowledge, every MPEG PS & TS has all 5 metadata values. Certainly, every MPEG stream *I've* parsed has all 5. picture_coding_extension: 'progressive_frame'? ... while this is unusual, even for movies in PAL streams. For what it's worth, I have only one PAL movie, "The Man Who Would Be King", from Australia. It has all 5 metadata values and appears to be a regular MPEG PS. Regarding 'progressive_frame', ffmpeg has 'interlaced_frame' in lieu of 'progressive_frame'. I think that 'interlaced_frame' = !'progressive_frame' but I'm not sure. Confirming it as a fact is a side project that I work on only occasionally. H.262 defines "interlace" as solely the condition of PAL & NTSC scan-fields (i.e. field period == (1/2)(1/FPS)), but I don't want to pursue that further because I don't want to be perceived as a troll. :-) - Mark. ___ ffmpeg-user mailing list ffmpeg-user@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-user To unsubscribe, visit link above, or email ffmpeg-user-requ...@ffmpeg.org with subject "unsubscribe".
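One way to test the 'interlaced_frame' = !'progressive_frame' hunch against real discs is to dump the per-frame flags that the decoder exposes. A sketch, with 'input.vob' as a placeholder name:
$ ffprobe -v error -select_streams v:0 -show_entries frame=interlaced_frame,top_field_first -of csv=p=0 input.vob
Each output line is one decoded frame; comparing those flags against the progressive_frame bits from a separate parse of the same stream would confirm or refute the inversion.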
Re: [FFmpeg-user] bwdif filter question
On 09/18/2020 03:01 PM, Carl Eugen Hoyos wrote: On 16.09.2020 at 15:58, Mark Himsley wrote: On Mon, 14 Sep 2020 at 15:42, Mark Filipak (ffmpeg) wrote: Is the input to the bwdif filter fields or frames? The input to every filter in a filter chain is a raster of pixels. That raster may contain one frame or two fields. That may not be wrong (apart from Paul’s comment) but I wonder how useful it is: No matter if the raster contains one field, two interlaced fields or a progressive frame, the filter will always see an input frame. "...if the raster contains *one field*...the filter will always see an input *frame*." How is that possible? How can a frame contain just one field? The fact that there is metadata that may signal the content is also not necessarily helpful as this metadata is typically wrong (often signalling fields when a frame is provided). Can you provide an example (or a link to an example)? I've examined a great number of DSM mpeg presentation streams ('VOB's & 'm2ts's) and I've not seen a single case. What metadata are you looking at? sequence_extension: 'progressive_sequence'? picture_coding_extension: 'picture_structure'? picture_coding_extension: 'top_field_first'? picture_coding_extension: 'repeat_first_field'? picture_coding_extension: 'progressive_frame'? That’s why the filter ignores the information by default. (If you provide only one field, no FFmpeg deinterlacer will produce useful output.) The bwdif filter will interpret a single raster and is designed to output two rasters, each containing one or the other of the fields that were contained in the input raster. "...interpret a *single raster*...one or the other of the fields...in the *input raster*." Mark Himsley, how are you defining "raster"? I thought you were equating a "single raster" with a frame and "two rasters" with fields, but now I'm unsure what you mean. You can request that the filter output one instead of two rasters for one input raster. -- COVID-19 perspective, 0 AM UTC, 20 September 2020. Yesterday, China: 14 new cases, S.Korea: 110, U.S.: 42,533. Today, U.S.: 4% of world population, 22% of cases, 21% of deaths. Today, U.S.: Of 4,427,517 resolved cases, 95% lived, 5% died. 22 Jan: U.S. & S.Korea reported 1st cases on the same day. 6 Mar, S.Korea: 140,000 total tests; results in 4 hours. 6 Mar, U.S.: 2000 total tests; results in 1 to 2 weeks. May, for the first time, U.S. care-homes are required to report. 1 Jun, total care-home deaths, S.Korea: 0, U.S.: 26,000 to 40,000. 5 Aug, U.S. tests still only 1/4 of number needed; 4 day results. 1 Sep, over 60% of U.S. nurses still lack protective equipment. 18 Sep, U.S. doctors & nurses still acutely lack PPE; 1200 dead. ___ ffmpeg-user mailing list ffmpeg-user@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-user To unsubscribe, visit link above, or email ffmpeg-user-requ...@ffmpeg.org with subject "unsubscribe".
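For what it's worth, the one-raster-versus-two-rasters choice described above is bwdif's 'mode' option. A sketch, with placeholder filenames:
$ ffmpeg -i input.vob -vf bwdif=mode=send_field output.mkv
$ ffmpeg -i input.vob -vf bwdif=mode=send_frame output.mkv
'send_field' emits one output frame per input field (double rate); 'send_frame' emits one output frame per input frame (same rate).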
[FFmpeg-user] Video Glossary (was: bwdif filter question)
On 09/16/2020 09:58 AM, Mark Himsley wrote: On Mon, 14 Sep 2020 at 15:42, Mark Filipak (ffmpeg) wrote: Is the input to the bwdif filter fields or frames? The input to every filter in a filter chain is a raster of pixels. That raster may contain one frame or two fields. The bwdif filter will interpret a single raster and is designed to output two rasters, each containing one or the other of the fields that were contained in the input raster. Thank you Mark. I note how you've employed the word "raster". I think that's a useful step. I've spent a month documenting the various macroblock formats (aka syntax) and creating various diagrams showing what I've learned along the way. The physical structure of frame data may be coming into focus, at least in my head and in my diagrams. It appears that the pictures people hold in their heads change depending on context: 1, Encoded frames (slices, macroblocks, etc.) as found on-disc or in a stream, 2, Picture frames output by an input decoder, and 3, Picture structures v. half-picture structures (i.e. frames v. fields, what you are calling "rasters") within filter chains. Each is unique. Each has unique structure and usage. However, undifferentiated names (e.g. "frames", "fields", "interlace", "deinterlace") are being applied. People are relying on context to fill in the gaps. But when context goes unsaid, confusion reigns, leaving us trapped in a video Tower of Babel. The confusion is not limited to this mailing list. The folks who wrote and revise the H.222 and H.262 specifications clearly also relied on context. The result is that H.222 & H.262 seem ambiguous and confusing. Appropriate contextual names based on physical data structures should be created to end the confusion. That is what I'm attempting. I invite interested people to join me in a glossary project. Regards, Mark. ___ ffmpeg-user mailing list ffmpeg-user@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-user To unsubscribe, visit link above, or email ffmpeg-user-requ...@ffmpeg.org with subject "unsubscribe".
Re: [FFmpeg-user] bwdif filter question
On 09/14/2020 03:23 PM, Bouke wrote: Note: I'm experimenting with virtual identities in my Thunderbird email client because the ffmpeg archives publish email addresses and I wish to spoof in order to avoid spam harvesters. And the fact that you are a troll has nothing to do with it? How did this list get so bad? ___ ffmpeg-user mailing list ffmpeg-user@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-user To unsubscribe, visit link above, or email ffmpeg-user-requ...@ffmpeg.org with subject "unsubscribe".
[FFmpeg-user] bwdif filter question
Is the input to the bwdif filter fields or frames? According to previous threads, ffmpeg decoders solely produce frames, but based on this: https://ffmpeg.org/ffprobe-all.html#bwdif I honestly can't figure out which it is: fields or frames. Thanks! Note: I'm experimenting with virtual identities in my Thunderbird email client because the ffmpeg archives publish email addresses and I wish to spoof in order to avoid spam harvesters. So, if this thread gets screwed up, kindly be tolerant. Thanks! ___ ffmpeg-user mailing list ffmpeg-user@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-user To unsubscribe, visit link above, or email ffmpeg-user-requ...@ffmpeg.org with subject "unsubscribe".
Re: [FFmpeg-user] "interlace"
On 09/06/2020 05:24 PM, Edward Park wrote: I have great respect for you, Ted. Tell me: What is interlaced video? Well that is unwarranted, possibly spiteful, even, to someone as insecure as me ;) Hahahaha... Ted, I want you to do something for me. Take your right hand, put it over your left shoulder, extend it down your back, and, with an open palm, slap it against your back two or three times. This list is a better place with you in it. That is the real question, isn't it. Probably won't be very satisfying but I'll try to describe it. Probably not useful to say it's video that is made of interlaced (or to-be-interlaced) lines. Well, that's interesting. Are interlaced *lines* a requirement for interlaced video? I would say so, but tell me: How does what is called interlaced video relate to interlaced *lines*? Video as in not film? When individual frames don't correspond to a single moment in time. Well, there you go. That's what everyone says. But a video that is field-based (aka "interlaced", yuck!) can also portray a single moment in time. In fact, the vast majority of so-called "interlaced" video is actually concurrent. I'm sure you agree with that, eh? So, if "interlaced" is not really a good word for "in fields" or "composed of fields", then what is a good term? Variables: Frame-based v. field-based Concurrent v. successive (or, if you prefer, pictures v. scans) There are 4 combinations: 1, Frame-based pictures 2, Field-based pictures 3, Frame-based scans 4, Field-based scans Number 1 seems reasonable. Agreed? If a picture is an image that completely fills the frame, then number 2 is a contradiction in terms. Agreed? I use the term "half-picture" for the fields extracted from a frame. What do you think of that? Number 3 is possible, but not desirable due to excessive combing. Agreed? Number 4 applies to streams that were generated by scanning cameras (TV studios) to be viewed on scanning TVs (CRTs), so it is not really applicable for new video today. Nonetheless, they do exist, if only as legacy video. Agreed? 1, Frame-based pictures 2, Field-based half-pictures 3, Frame-based scans -- beware! 4, Field-based scans -- legacy What do you think? Regards, Ted Park ___ ffmpeg-user mailing list ffmpeg-user@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-user To unsubscribe, visit link above, or email ffmpeg-user-requ...@ffmpeg.org with subject "unsubscribe".
Re: [FFmpeg-user] "interlace"
On 09/06/2020 04:16 PM, Edward Park wrote: Hi, In the case of "interlace": "It's interlaced video" -- video in which the lines alternate (i.e. are interlaced) between two (or theoretically, more) fields (e.g. odd-even-odd-even...). That employs the past participle of the verb, "interlace", as an adjective to modify "video". H.262 refers to "interlace video" when referring to hard telecine. But "interlace video" is a bit of a mutt. "Interlace" is clearly being used as an adjective, yet "interlace" is not a participle (past or present) -- "interlaced" is the past participle and "interlacing" is the present participle. What it would have to be is a verbal adjective (i.e. a verb used as an adjective). I may be wrong, but I don't think there exists such a thing as a verbal adjective. That's more or less what a participle is. A hard telecined video residing on a disc is clearly not interlaced. It is clearly deinterlaced (i.e. stored in fields). Since it is deinterlaced, it can be fed directly to a scanning TV (i.e. CRT) provided it is of the proper frame rate, or it can be interlaced -- a verb -- by a decoder as part of the decoding step. Is it? Hard telecine is like telecining film and then recording it on a VCR, isn't it? And you don't need to deinterlace interlaced video to display it on an interlaced scanning TV. I think the confusion is when you deinterlace interlaced video, it is still interlaced video (or at least I think of it that way). Regards, Ted Park "interlace [transitive verb]: 1. To connect by or as if by lacing together; interweave. 2. To intersperse; intermix: 'interlaced lies with truth'. [intransitive verb]: To intertwine." The word "interlaced" is universally applied to scan fields. Scan fields are not interlaced. The lines are aggregated into odd and even fields (i.e. blocks at the macroblock level) but the rows ("lines" if you will) are not interlaced. Scan fields are interleaved, but that is not interlacing. If anything, that is deinterlacing. Given the historic misuse of "interlace", I will continue to use alternatives. The best alternative is "field-based". Next is "unweaved". Also acceptable is "interleaved" because that aptly describes the condition of the fields. ___ ffmpeg-user mailing list ffmpeg-user@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-user To unsubscribe, visit link above, or email ffmpeg-user-requ...@ffmpeg.org with subject "unsubscribe".
Re: [FFmpeg-user] "interlace"
On 09/06/2020 03:33 PM, Carl Eugen Hoyos wrote: On Sun, 6 Sep 2020 at 21:27, Mark Filipak wrote: All of the ffmpeg documentation that uses the word "interlaced" should be checked for accuracy. Since real-world users are using this documentation it should only be carefully changed. Or in other words: We will certainly not change our definition of "interlaced video" - many people seem to understand it. Do they really? There are alternatives that are more appropriate. ___ ffmpeg-user mailing list ffmpeg-user@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-user To unsubscribe, visit link above, or email ffmpeg-user-requ...@ffmpeg.org with subject "unsubscribe".
Re: [FFmpeg-user] "interlace"
On 09/06/2020 04:16 PM, Edward Park wrote: Hi, In the case of "interlace": "It's interlaced video" -- video in which the lines alternate (i.e. are interlaced) between two (or theoretically, more) fields (e.g. odd-even-odd-even...). That employs the past participle of the verb, "interlace", as an adjective to modify "video". H.262 refers to "interlace video" when referring to hard telecine. But "interlace video" is a bit of a mutt. "Interlace" is clearly being used as an adjective, yet "interlace" is not a participle (past or present) -- "interlaced" is the past participle and "interlacing" is the present participle. What it would have to be is a verbal adjective (i.e. a verb used as an adjective). I may be wrong, but I don't think there exists such a thing as a verbal adjective. That's more or less what a participle is. A hard telecined video residing on a disc is clearly not interlaced. It is clearly deinterlaced (i.e. stored in fields). Since it is deinterlaced, it can be fed directly to a scanning TV (i.e. CRT) provided it is of the proper frame rate, or it can be interlaced -- a verb -- by a decoder as part of the decoding step. Is it? Hard telecine is like telecining film and then recording it on a VCR, isn't it? And you don't need to deinterlace interlaced video to display it on an interlaced scanning TV. I think the confusion is when you deinterlace interlaced video, it is still interlaced video (or at least I think of it that way). Regards, Ted Park I have great respect for you, Ted. Tell me: What is interlaced video? ___ ffmpeg-user mailing list ffmpeg-user@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-user To unsubscribe, visit link above, or email ffmpeg-user-requ...@ffmpeg.org with subject "unsubscribe".
Re: [FFmpeg-user] Question about macroblocks in soft telecined video
On 09/06/2020 03:31 PM, Carl Eugen Hoyos wrote: On Sun, 6 Sep 2020 at 19:07, Mark Filipak wrote: First, Carl Eugen, could you fix your email client so that it doesn't echo people's email addresses in the clear? -snip- So-called "progressive" video -- I prefer "concurrent" -- is contained in interlaced data structures within the macroblocks. Hard telecined video (which may or may not be concurrent depending on whether it comes from cinema or from TV) is contained in deinterlaced data within the macroblocks that the decoder interlaces to make frames as part of the decoding process. My question has to do with whether, for soft telecine, the fields in the chrominance blocks of undecoded macroblocks are deinterlaced or interlaced. As said before: I don't think this makes sense (not even for mpeg2 video). I offered to post pictures but you didn't respond. I really like pictures. Pictures don't generally require explanations. Written text in lieu of pictures is counterproductive. The more that is written, the more that can be wrong or misunderstood. ffmpeg documentation doesn't have (avoids?) pictures. Why is that? ___ ffmpeg-user mailing list ffmpeg-user@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-user To unsubscribe, visit link above, or email ffmpeg-user-requ...@ffmpeg.org with subject "unsubscribe".
Re: [FFmpeg-user] "interlace"
On 09/06/2020 03:32 PM, Bouke wrote: On 06 Sep 2020, at 21:26, Mark Filipak wrote: Conclusion: Applying the past participle, "interlaced", to a field-based video stream is just plain wrong. Can you just shut up / stop spreading nonsense? How is it nonsense? The past participle, "interlaced", is being misapplied by people who exhibit poor language usage, and the misapplication has unfortunately caught on. ffmpeg is unintentionally helping to spread the misapplication. This controversy and prejudice in favor of poor usage is why I recommend that "interlace" be retired in favor of "interleave". ___ ffmpeg-user mailing list ffmpeg-user@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-user To unsubscribe, visit link above, or email ffmpeg-user-requ...@ffmpeg.org with subject "unsubscribe".
[FFmpeg-user] "interlace"
Admittedly, this is an anal-compulsive topic. But good documentation often requires anal compulsion. Good documentation definitely requires thorough knowledge of language usage. The topic is use of the various forms of the word "interlace". But I'm going to use the word "build" as an example because it has more, and more varied, cases. "Let's build it" -- "build" as a verb. "We will be building it" employs "build" as a present participle. "Assemble the building crew" is the present participle used as an adjective (the crew is actively engaged in building something, so it is the "building crew" as opposed to some other "crew"). "It's a tall building" employs "build" as a verbal noun. "Assemble the building's crew" is the verbal noun used as a possessive adjective (the crew is associated with "the building" as opposed to some other crew that's not associated with "the building") in the same way that "the cat's meow" uses the ordinary noun, "cat", as a possessive adjective. In the case of "interlace": "It's interlaced video" -- video in which the lines alternate (i.e. are interlaced) between two (or theoretically, more) fields (e.g. odd-even-odd-even...). That employs the past participle of the verb, "interlace", as an adjective to modify "video". H.262 refers to "interlace video" when referring to hard telecine. But "interlace video" is a bit of a mutt. "Interlace" is clearly being used as an adjective, yet "interlace" is not a participle (past or present) -- "interlaced" is the past participle and "interlacing" is the present participle. What it would have to be is a verbal adjective (i.e. a verb used as an adjective). I may be wrong, but I don't think there exists such a thing as a verbal adjective. A hard telecined video residing on a disc is clearly not interlaced. It is clearly deinterlaced (i.e. stored in fields). Since it is deinterlaced, it can be fed directly to a scanning TV (i.e. CRT) provided it is of the proper frame rate, or it can be interlaced -- a verb -- by a decoder as part of the decoding step. Conclusion: Applying the past participle, "interlaced", to a field-based video stream is just plain wrong. All of the ffmpeg documentation that uses the word "interlaced" should be checked for accuracy. -- Racism is like roaches. When you switch on the light, they scurry. But if you don't switch on the light, you don't know they're there. ___ ffmpeg-user mailing list ffmpeg-user@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-user To unsubscribe, visit link above, or email ffmpeg-user-requ...@ffmpeg.org with subject "unsubscribe".
Re: [FFmpeg-user] Question about macroblocks in soft telecined video
On 09/06/2020 12:07 PM, Carl Eugen Hoyos wrote: On Sun, 6 Sep 2020 at 09:28, Mark Filipak wrote: [...] Soft telecined video is actually 24/1.001 frames per second of video even though the metadata tells the decoder to produce 30/1.001 FPS. On the FFmpeg user mailing list, "decoder" and "metadata" have relatively strict meanings. Your point is a 'lateral arabesque'. Argue the issues please. Don't bring up something unrelated in an attempt to discredit what I wrote. Given these meanings, what you write is simply wrong. (Apart from the fact that telecined content does not necessarily have a framerate of 24000/1001, ... That's an ad hominem attack. I didn't say "telecined". I said "soft telecined". *Soft telecined* frames *do* have an FPS of 24/1.001. ... the whole point is that it can have any framerate, in the case of ntsc crts any framerate smaller than 3/1001.) CRTs? What does the display device have to do with anything? (And NTSC CRTs definitely do have 30/1.001 Hz frame rate. Solely 30/1.001 Hz frame rate. Otherwise, they're not NTSC.) Of course, the metadata is the key to how the stream 'teaches' the decoder how to telecine. But since we don't want to telecine, this is irrelevant: We don't have access to an ntsc crt. If you decide to telecine, this cannot be done in the (FFmpeg) decoder, you need a (FFmpeg) video filter. You continue to fail to understand the issue. I'm addressing undecoded video, not decoded video. Why am I addressing undecoded video? Because that's what's on a DVD or BD. You don't care about that because you (ffmpeg developers) get decoded frames from the decoder. We (ffmpeg users) need to detect what's on discs because that's important. MPV is smart enough to recognize 24/1.001 FPS data and to ignore the metadata and to play at 24/1.001 FPS. Ffmpeg can do the same thing (and thereby eliminate the need to transcode) Telecine and transcoding do not depend on each other so this is highly misleading. They do depend on each other. Users transcode soft telecined video to get 24 FPS because they think that's what they need to do. Instead, all they need to do is force 24/1.001 -- at least, that's what I think, but I'm unsure because ffmpeg is so poorly documented. but the ffmpeg user has to tell ffmpeg to do it. No, in general, this is not true. (Command line and complete, uncut console output missing.) I don't know how to respond to that. Console output doesn't have anything to do with adequate documentation. The issue isn't a command line mistake. The issue is: How does ffmpeg work? A few random remarks: You provided some definitions (actually claims) in your email without explaining in which domain you believe they are valid. ... The 'domain' is undecoded video as found on DVDs & BDs. ... They are not valid in the general case (and not valid in most domains). I believe you somewhere mention that you want to detect soft-telecined content. This is trivial and is not related to macroblocks. So-called "progressive" video -- I prefer "concurrent" -- is contained in interlaced data structures within the macroblocks. Hard telecined video (which may or may not be concurrent depending on whether it comes from cinema or from TV) is contained in deinterlaced data within the macroblocks that the decoder interlaces to make frames as part of the decoding process. My question has to do with whether, for soft telecine, the fields in the chrominance blocks of undecoded macroblocks are deinterlaced or interlaced. 
Your definition of our different interpretations of "interlaced" is completely backwards and wrong: ... If you read H.222 and H.262 carefully, you will find that so-called "interlaced" video is not called "interlaced". It's called "interlace" (no "d"). It's called "interlace" because interlace is what the decoder does. If you read H.222 and H.262 carefully, you will find that the undecoded data is, in fact, deinterlaced (in other words: The undecoded data is in fields, not frames). Interlaced frames are NOT what's found in undecoded video. What's found in undecoded video are deinterlaced fields. It is you who have it backwards. And that backwardness is what confused me and what confuses others. ... A digital video stream can be encoded (!, not decoded as you imply) using encoding techniques meant for interlaced content. The decoder has to detect this and comply with it but this has no relevance for a user (for many years it was very important for FFmpeg developers). For display, it is very important to know if the actual content is interlaced or not. Most video players take the first information to decide but this is not valid in general (or, u
Re: [FFmpeg-user] Question about macroblocks in soft telecined video
On 09/06/2020 02:26 AM, Carl Eugen Hoyos wrote: On Sun, 6 Sep 2020 at 06:20, Mark Filipak wrote: I would guess that, for an undecoded video that's soft telecined (i.e. @24/1.001 FPS), the interlace in the macroblocks is field-based (i.e. the same as if @30/1.001 FPS), not frame-based (i.e. the same as if @24 FPS). This does not make sense. Okay. I followed what I wrote with an example (below), so I'll go with the example because... examples are usually easier to understand than are abstractions. Specifically, for YCbCr420, blocks 5 (Cb) & 6 (Cr) are deinterlaced into top & bottom fields (in the top-half & bottom-half of the blocks) rather than being progressive. This seems more difficult to understand but I don't think it makes sense either. A YCbCr420 macroblock contains 6 blocks: blocks 1..4, which are 256 samples of Y (luminance) at full resolution, and blocks 5+6, which are each 64 samples of chrominance: Cb & Cr, at 1/4 resolution. I'm happy to explain. For undecoded, frame-based (so-called "progressive") macroblocks, the Y-blocks (1..4) are 4-way interleaved in the stream (2x2 samples/quad, 4x4 quads/block, 2x1 blocks/field, and 1x2 fields/frame) -- [1] -- and must be deinterleaved by the decoder to arrive at whole and unbroken (concurrent) frames. Staying with frame-based macroblocks, the Cb & Cr blocks (5 & 6) are entirely uninterleaved because 64 samples (remember, 1/4 resolution) fit in 64 bytes. [1] I reserve the word "interlace" for whole sample rows, only, not parts of sample rows. Frame-based macroblocks are never deinterlaced when they're encoded. Consequently, they don't have to be interlaced when they're decoded because they're already interlaced. For undecoded, field-based (so-called "interlaced") macroblocks, in addition to the 4-way interleaves, the 2x2 quads between blocks 1 & 3 (and also between blocks 2 & 4) are deinterlaced in the stream and must be interlaced by the decoder to arrive at whole and unbroken frames. The equivalent deinterlace for chrominance is that the rows in the top half of block 5 (Cb) are deinterlaced with the rows in the bottom half of block 5, and must be interlaced by the decoder. Likewise for block 6 (Cr). You see, to you guys, all that's important is that frames come out of the decoder as whole, unbroken streams. But to anyone who examines disc files (VOBs), the stuff on the disc is all we have to work with when trying to figure out a strategy for identifying the nature of the source. And the nature of the source is important. Okay, to proceed. Soft telecined video is actually 24/1.001 frames per second of video even though the metadata tells the decoder to produce 30/1.001 FPS. Of course, the metadata is the key to how the stream 'teaches' the decoder how to telecine. MPV is smart enough to recognize 24/1.001 FPS data and to ignore the metadata and to play at 24/1.001 FPS. Ffmpeg can do the same thing (and thereby eliminate the need to transcode), but the ffmpeg user has to tell ffmpeg to do it. Okay, so what does this have to do with macroblocks? Well, I'm writing a video glossary and I want it to be complete. For example, think back to the controversy we've had regarding the meaning of "interlace". From your perspective, interlace is a function that the decoder performs. From my perspective, interlace is a condition of the stream. From your perspective, it seems like I don't know what I'm talking about. From my perspective, it seems like ffmpeg developers are using sloppy nomenclature. We have both been right and wrong. 
Studying macroblocks has shown me what your perspective is. To you, interlacing is what the decoder must do. Of course that's correct. To me (prior to studying macroblocks), interlacing was an architectural feature of the stream (program or transport), and saying that fields that are clearly deinterlaced in the stream are 'interlaced' just didn't make sense. Of course, your perspective is after decoding while my perspective is prior to decoding (because discs contain undecoded macroblocks!). I hope I've made myself clearly understood. By the way, I have made pictures of all this stuff. Would you like to see them? -- Racism is like roaches. When you switch on the light, they scurry. But if you don't switch on the light, you don't know they're there. ___ ffmpeg-user mailing list ffmpeg-user@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-user To unsubscribe, visit link above, or email ffmpeg-user-requ...@ffmpeg.org with subject "unsubscribe".
[FFmpeg-user] Question about macroblocks in soft telecined video
I can't answer this for myself because I don't have the tools needed to probe into undecoded macroblocks (at least, I don't think I have the tools). I would guess that, for an undecoded video that's soft telecined (i.e. @24/1.001 FPS), the interlace in the macroblocks is field-based (i.e. the same as if @30/1.001 FPS), not frame-based (i.e. the same as if @24 FPS). Specifically, for YCbCr420, blocks 5 (Cb) & 6 (Cr) are deinterlaced into top & bottom fields (in the top-half & bottom-half of the blocks) rather than being progressive. Is my guess correct? Thanks so much! - Mark. -- Racism is like roaches. When you switch on the light, they scurry. But if you don't switch on the light, you don't know they're there. ___ ffmpeg-user mailing list ffmpeg-user@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-user To unsubscribe, visit link above, or email ffmpeg-user-requ...@ffmpeg.org with subject "unsubscribe".
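A partial check that doesn't require probing undecoded macroblocks: soft telecine announces itself through the repeat flags that the decoder passes along. A sketch, with 'input.vob' as a placeholder name:
$ ffprobe -v error -select_streams v:0 -show_entries frame=repeat_pict,interlaced_frame -of csv=p=0 input.vob
A nonzero repeat_pict corresponds to repeat_first_field being set in the picture_coding_extension; a recurring pattern of such frames in a nominally 30/1.001 FPS stream is the signature of soft telecine.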
Re: [FFmpeg-user] builds for Windows
On 09/03/2020 04:58 PM, Carl Zwanzig wrote: On 9/3/2020 1:49 PM, Mark Filipak wrote: I'm not sure but I think building ffmpeg is pretty time-consuming. If I obtain help figuring out how to do it -- dependencies have defeated me -- I can do Linux makes, perhaps on alternate days. If others do the same, the task can become distributed and less of an individual burden. I'm up for that. Mark- build environment makes a fair difference; I use MinGW-64 and it works fairly well, that is, no hairy problems that I recall :). z! When I submit "MinGW-64" to Synaptic, I get 11 hits, none of which are installed. Synaptic includes no way for me to copy the list, but here's the 'prime suspect'. :-) 'mingw-w64' 4.0.4-2 When I mark that for installation, 12 additional dependencies are marked (but, curiously, not 'mingw-w64-tools' 4.0.4-2). In the past, installing 'mingw-w64' was not sufficient. ___ ffmpeg-user mailing list ffmpeg-user@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-user To unsubscribe, visit link above, or email ffmpeg-user-requ...@ffmpeg.org with subject "unsubscribe".
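For reference, once the cross compilers are installed, a bare-bones cross build with no external libraries is roughly the following -- a sketch, not a recipe, assuming a Debian-family system:
$ sudo apt install mingw-w64
$ ./configure --enable-cross-compile --cross-prefix=x86_64-w64-mingw32- --arch=x86_64 --target-os=mingw32
$ make -j$(nproc)
Each external library (x264, etc.) then needs its own cross-compiled build, which is where the dependency trouble usually begins.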
Re: [FFmpeg-user] builds for Windows
On 09/03/2020 01:42 AM, Gyan Doshi wrote: On 02-09-2020 11:53 pm, Michael Koch wrote: "ffmpeg.zeranoe.com will close on Sep 18, 2020, and all builds will be removed." Any idea where we can get builds for Windows after this date? I plan to provide 64-bit static builds starting the 18th. Will update doc with link once I make arrangements. Gyan Gyan, I'm not sure but I think building ffmpeg is pretty time-consuming. If I obtain help figuring out how to do it -- dependencies have defeated me -- I can do Linux makes, perhaps on alternate days. If others do the same, the task can become distributed and less of an individual burden. - Mark. ___ ffmpeg-user mailing list ffmpeg-user@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-user To unsubscribe, visit link above, or email ffmpeg-user-requ...@ffmpeg.org with subject "unsubscribe".
Re: [FFmpeg-user] prores_ks vendor: apl0 or ap10 ?
On 09/01/2020 09:16 AM, Paul B Mahol wrote: On 9/1/20, Phil Rhodes via ffmpeg-user wrote: I don't know if you have the slightest idea how offensive you're being - perhaps it's a language problem, so I'll give you the benefit of not much doubt. Even so, you should know that what you're saying is the sort of thing that would ordinarily get people thrown out of meetings or dismissed from jobs. Really, you shouldn't do that. And again, if you're putting it in the documentation, you are recommending it. If you don't want to recommend it, fine, take it out of the documentation. What you're overlooking here is that I'm trying to help you, and yes, you do very definitely need the help. This is very offensive behavior from you. I'm simply stating the fact that you are not telling the truth when saying that FFmpeg recommended changing the default vendor string in the encoded ProRes bitstream. That's what you think you're saying. Paul, you have the manners of an angry rhinoceros. ___ ffmpeg-user mailing list ffmpeg-user@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-user To unsubscribe, visit link above, or email ffmpeg-user-requ...@ffmpeg.org with subject "unsubscribe".
Re: [FFmpeg-user] Some questions about PTS
The PTS value below, 000- 000- 1000 1001 1000 111-, is for a particular case. It will vary for other videos. I hope that didn't confuse you. On 08/28/2020 03:51 AM, Michael Koch wrote: Hello all, please comment if the following is right or wrong: -- The timebase of a video (TB in setpts filter) is expressed in the unit [s] (seconds). -- The framerate of a video (FR in setpts filter) is expressed in the unit [s^-1] (1/seconds). -- In many cases the timebase is the reciprocal of the framerate, but this isn't always the case. -- If the timebase is the reciprocal of the framerate, a stream can be delayed by x frames using setpts=PTS+x -- In the more general case for arbitrary timebase and framerate, a stream can be delayed by x frames using setpts=PTS+x/(FR*TB) I don't know enough yet to know whether this helps, but PTS in H.222 (§2.1.57) is: "A field that may be present in a PES packet header that indicates the time that a presentation unit is presented in the system target decoder." It's a 33-bit field in the PES HEADER EXTENSION, starting at bit 4, [.4] PTS: 000- 000- 1000 1001 1000 111- and must be divided by 90 kHz in order to yield seconds. I've seen it solely in key frames. I've computed PTS and the deltas between PTSs for several dozen professionally authored videos and they do not correlate to frames, even for CFR video. H.222 presents a decoding model and PTS seems to conform to that model, not to the frames in the stream. H.222 appears to be insufficiently documented to resolve the discrepancy. It remains mysterious but I hope to eventually resolve it. - Mark. ___ ffmpeg-user mailing list ffmpeg-user@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-user To unsubscribe, visit link above, or email ffmpeg-user-requ...@ffmpeg.org with subject "unsubscribe".
Re: [FFmpeg-user] Some questions about PTS
On 08/28/2020 03:51 AM, Michael Koch wrote: Hello all, please comment if the following is right or wrong: -- The timebase of a video (TB in setpts filter) is expressed in the unit [s] (seconds). -- The framerate of a video (FR in setpts filter) is expressed in the unit [s^-1] (1/seconds). -- In many cases the timebase is the reciprocal of the framerate, but this isn't always the case. -- If the timebase is the reciprocal of the framerate, a stream can be delayed by x frames using setpts=PTS+x -- In the more general case for arbitrary timebase and framerate, a stream can be delayed by x frames using setpts=PTS+x/(FR*TB) I don't know enough yet to know whether this helps, but PTS in H.222 (§2.1.57) is: "A field that may be present in a PES packet header that indicates the time that a presentation unit is presented in the system target decoder." It's a 33-bit field in the PES HEADER EXTENSION, starting at bit 4, [.4] PTS: 000- 000- 1000 1001 1000 111- and must be divided by 90 kHz in order to yield seconds. I've seen it solely in key frames. I've computed PTS and the deltas between PTSs for several dozen professionally authored videos and they do not correlate to frames, even for CFR video. H.222 presents a decoding model and PTS seems to conform to that model, not to the frames in the stream. H.222 appears to be insufficiently documented to resolve the discrepancy. It remains mysterious but I hope to eventually resolve it. - Mark. ___ ffmpeg-user mailing list ffmpeg-user@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-user To unsubscribe, visit link above, or email ffmpeg-user-requ...@ffmpeg.org with subject "unsubscribe".
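A worked example of the last formula: with FR = 25 and TB = 1/12800, delaying by x = 10 frames gives 10/(25*(1/12800)) = 5120 timebase ticks added to each PTS. As a command-line sketch (placeholder filenames):
$ ffmpeg -i input.mp4 -vf "setpts=PTS+10/(FR*TB)" output.mp4
Note that FR is only defined for constant-frame-rate video, so this expression is not meaningful for VFR input.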
Re: [FFmpeg-user] Some more thoughts on documentation
On 08/23/2020 04:27 PM, Carl Zwanzig wrote: On 8/23/2020 12:30 PM, Mark Filipak wrote: Developer preparation: Kindly add comments to the source code. The comments don't need to teach or explain. They just need to repeat the code's logic, but in English instead of Martian. I think we'll have to disagree on that; most of the basic logic should be fairly clear to someone who knows the 'c' language. That leaves me out. It's also not true even for 'C' programmers for many reasons involving poor choices of variable names, undocumented structures, strange methods, misunderstandings, etc., but I don't want to get into a debate. I'd rather submit some of what I've already done and see what folks think. In other words, I'd rather do documentation than talk about documentation. Let's decide what works as we go, eh? How do we start? Should I post stuff to this list? Or is there another, better way? OTOH, there will be some cases where they -do- need to be explained; there are always points in code which aren't obvious to the casual reader - could be some obscure part of the encoding, could be that it fails on some architectures but not on others, could be that a certain flag isn't appropriate, etc. But in general, most code out there isn't well commented, anyway. (Way back when, code wasn't considered well-commented if less than maybe 20-25% of the non-blank lines were comments.) See also https://ffmpeg.org/developer.html#Comments That's written in Martian. I don't read Martian. https://ffmpeg.org/developer.html#toc-Code-of-conduct "Kumbaya". Its pertinence to writing documentation is what? However that doesn't directly affect the user documentation. Agreed. - Mark. ___ ffmpeg-user mailing list ffmpeg-user@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-user To unsubscribe, visit link above, or email ffmpeg-user-requ...@ffmpeg.org with subject "unsubscribe".
Re: [FFmpeg-user] Some more thoughts on documentation
On 08/23/2020 02:07 PM, Carl Zwanzig wrote: -snip- Charlie's Rule of Technical Reading: It doesn't say what you think it says, nor what you remember it to have said, nor what you were told that it says, and certainly not what you want it to say, and if by chance you are its author, it doesn't say what you intended it to say. Then what does it say? It says what it says. So if you want to know what it says, stop trying to remember what it says, and don't ask anyone else. Go back and read it, and pay attention as though you were reading it for the first time. --Charles E. Beck, P.E., Seattle, WA © 2005 I like it! I also like: "Don't tell, show!" Lots of examples. Examples of increasing complexity that are selected to progressively illustrate a single principle without explicitly preaching. Lots of pictures and diagrams. Emphasis on scanability and hackability as much as on readability. Only one idea per sentence, one subject per paragraph. Favor separate short sentences over compound sentences. Favor repeating names over using pronouns: "that", "this", "it", etc. If pronouns are used, employ references within the same sentence, not to previous sentences, not to future sentences, and certainly not to other paragraphs. Try hard to make the subject of a sentence a named item instead of a pronoun -- reserve pronouns for objects or object clauses. Developer preparation: Kindly add comments to the source code. The comments don't need to teach or explain. They just need to repeat the code's logic, but in English instead of Martian. ___ ffmpeg-user mailing list ffmpeg-user@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-user To unsubscribe, visit link above, or email ffmpeg-user-requ...@ffmpeg.org with subject "unsubscribe".
Re: [FFmpeg-user] A systematic effort for documentation? [was: Re: Why is format=rgb24 required after maskedmerge?]
On 08/21/2020 11:20 PM, Jim DeLaHunt wrote: -snip- There is a great need for a glossary. It should be structured so that each term has an anchor, allowing references from anywhere in the documentation to the glossary. My nominations for entries: "fps", "GOP", "PTS", "time base". -snip- I have been working on a glossary for a very long time that includes all those and much more, and in which each and every element of a program stream has a clear, unambiguous structure definition sans implied relations (e.g. 'this' implies 'that') and sans vague references to metadata (e.g. a frame is a thing that metadata defines as a frame). I've worked my way deeply into the streams and am currently resolving macroblock internals [1]. The problem I'm encountering is that, in order to create clear, unambiguous definitions, I have had to create distinct names for differing things that currently share the same name and are differentiated only by 'context' (which sucks), and I suspect this will raise much controversy. For example, the word "frame" is applied to a great number of things that are not frames -- I have created several unique 'sample-sets' that cover the variant frames, fields, and scans. For example, the word "picture" is applied to so-called 'progressive' sample-sets, to hard telecined, concurrent "field pictures" (which I call "half-pictures"), and even to successive fields (which I call "scans"). In my effort, I've tried very hard to not change too much of the current nomenclature. [1] I've discovered that "interlace" applies not to lines on a display, but to samples within blocks within macroblocks. There are several interlace schemes and I'm defining each of them via easy-to-understand diagrams that simultaneously show how they are stored and traversed in-buffer versus where they wind up in slices on the display. While attempting to understand what ffmpeg developers mean when they refer to "interlace", I now appreciate that looking at top-level metadata in stream headers is futile -- they are not directly related. Without a "look" into how blocks and macroblocks are structured, one will never understand what ffmpeg developers say regarding "interlace". Mark. ___ ffmpeg-user mailing list ffmpeg-user@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-user To unsubscribe, visit link above, or email ffmpeg-user-requ...@ffmpeg.org with subject "unsubscribe".
Re: [FFmpeg-user] Why is format=rgb24 required after maskedmerge?
On 08/21/2020 11:08 AM, Gyan Doshi wrote: On 21-08-2020 07:12 pm, Mark Filipak wrote: On 08/21/2020 06:44 AM, Gyan Doshi wrote: On 21-08-2020 04:03 pm, Michael Koch wrote: On 21.08.2020 at 12:09, Gyan Doshi wrote: FFmpeg's documentation is very incomplete. I had a rough roadmap at the start of the year for plugging the gaps, but other events transpired. I hope to start a systematic effort in September. May I assist you? Sure. Do you already have a set of tasks / actions in mind? How about this? Until we get to know each other and you gauge my capabilities, I'll research and document whatever's important to you. I think that initially my most important role would be as an editor -- mostly grammar checking. I hope to also add some user perspective and I trust that you will not reject what I suggest without thinking about it. This will be a long journey during which the path will unfold before us. I can be patient. My experience has been to give users many, many examples having consistent structures with faith that they will see the important patterns. You know, humans are reported to be very good at seeing patterns. Remember a while ago when I asked folks to post working ffmpeg command lines no matter the task or format? My request was rebuffed, but what I was trying to do was gain experience by looking for patterns -- sort of a Krell mind boost. That's all I needed and I'm convinced that's all most people need. Mark. ___ ffmpeg-user mailing list ffmpeg-user@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-user To unsubscribe, visit link above, or email ffmpeg-user-requ...@ffmpeg.org with subject "unsubscribe".