you please take a look at performance with LD1R?
Regards,
Chen
At 2024-05-28 18:03:43, "Hari Limaye" wrote:
>Hi Chen,
>
>Thank you for reviewing the patches.
>
>>In this case, replacing LD1 with LDR+ADD brings no benefit
>
>Here, the existing instruction `ld1 {v0.s
Hi Hari,
These 8 patches look good; my only comment is on the code below.
=
.macro SAD_START_4 f
-ld1 {v0.s}[0], [x0], x1
+ldr s0, [x0]
+ldr s1, [x2]
+add x0, x0, x1
+add x2, x2, x3
Hi Hari,
The new patches look good to me now, thank you for your patches.
Regards,
Chen
At 2024-05-23 03:09:26, "Hari Limaye" wrote:
>Hi Chen,
>
>Thank you for reviewing the patches.
>
>>In signOf_neon
>>>+ // signOf(a - b) = -(a > b) |
e it is a similar algorithm to Neon
Regards,
Chen
At 2024-05-21 00:14:35, "Hari Limaye" wrote:
>Hi,
>
>This patch-series adds AArch64 Neon, SVE, and SVE2 implementations of
>the saoCuStats function primitives for low and high bitdepth.
>
>This series is based on the pre
Hi Hari Limaye,
Thank you for fixing the AArch64 build issues; these 12 patches look good to me.
Regards,
Chen
At 2024-05-03 05:19:36, "Hari Limaye" wrote:
>The assembly routine x265_costCoeffNxN_neon is buggy and produces an
>incorrect result on Apple Silicon, causing the
Hello,
Could you please try my local patch?
Regards,
Min Chen
2023-05-21 17:27:35,"Mario *LigH* Rohkrämer"
>Almost 3 years later, NASM version 2.16.01, and still no solution,
>nobody is responsible for "just" warnings.
>
>--
>
>Fun and success!
Hi,
I don't have an OS X environment, so I can only guess at the reason.
GCC and LLVM use different symbol prefixes.
We use "[private_prefix %+ _entropyStateBits]" in the x86 assembly code to
handle this.
But aarch64 uses "movrel x1, x265_entropyStateBits"; I didn't find these
little
verify patch?
Regards,
Min Chen
diff --git a/source/CMakeLists.txt b/source/CMakeLists.txt
index 13e4750de..80f8e59a9 100755
--- a/source/CMakeLists.txt
+++ b/source/CMakeLists.txt
@@ -266,6 +266,9 @@ if(GCC)
add_definitions(-DHAVE_NEON)
endif()
endif
Hi Song,
I mean the current tree already supports this adrp+add mode with the compile
option -DPIC, so we do not need to patch the code.
Regards,
Min Chen
2022-10-06 06:42:33,"Fangrui Song"
Hi Min, sorry but I just saw your question. I do not understand the request.
adrp+add is jus
'-fPIC -DPIC'
Regards, Min Chen
At 2022-09-24 15:21:35, "Fangrui Song" wrote:
>Ping. This breaks the lld build and some binutils configurations defaulting
>to disallow text relocations.
>
>On 2022-08-29, Fangrui Song wrote:
>>On 2022-08-29, Fangrui Song wrote:
>>>O
suggest keeping the current code, so users can configure it themselves.
btw: ENABLE_PIC looks like it only works with GCC, so I think we need to take a
look at this option on Apple platforms.
Regards,
Min Chen
At 2022-08-30 13:42:39, "Fangrui Song" wrote:
>On 2022-08-29, Fangrui Song wrote:
Hi Song,
Thank you for your patch.
However, the ':lo12:' syntax depends on the compiler, so the more general LDR is
better here.
Regards,
Min Chen
At 2022-08-30 02:33:37, "Fangrui Song" wrote:
>The ldr pseudo-instruction uses a literal pool, which is less efficient
>and d
-#if X86_64
+#if X86_64 || defined(__aarch64__)
[MC] This is right, but to be more generic we could check sizeof(long*)==8.
The others are fine.
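A note on the sizeof(long*)==8 idea: sizeof cannot be evaluated by the preprocessor, so it cannot sit directly in an #if. Below is a minimal sketch of one way to express the check as a compile-time constant instead. The names kPointerIs64Bit, pointerBits and pointerModel are hypothetical and not taken from the x265 tree.

```cpp
#include <cassert>

// Hypothetical names for illustration only; not part of x265.
// sizeof is not available inside #if, so the test is a constant expression.
constexpr bool kPointerIs64Bit = (sizeof(long*) == 8);

// Width of a data pointer in bits on the target being compiled for.
constexpr int pointerBits()
{
    return static_cast<int>(sizeof(long*) * 8);
}

// Example use: select a code path from the constant instead of from #if.
inline const char* pointerModel()
{
    return kPointerIs64Bit ? "64-bit pointers" : "narrower pointers";
}
```

If the decision must stay in the preprocessor, the check would have to be done by the build system and passed in as a define.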
Regards,
Min Chen
2022-03-25 00:24:01,"Pop, Sebastian"
Hi,
Please find attached a few more changes that bring up the p
sig, If we allow
the data in absCoeff to be stored sparsely, we can process all 16 elements
in parallel.
Regards,
Min Chen
At 2022-03-05 04:24:09, "Pop, Sebastian" wrote:
Thanks Min Chen for your feedback.
Please see attached a patch that avoids one transfer from N
Hi Roger,
Both version numbers look right.
The stable branch is 1 commit ahead of tag 3.5, and the master branch is further ahead.
So the version numbers are 3.5+1 and 3.5+34; the remaining part is the git hash.
Regards,
Min Chen
At 2022-03-11 12:41:44, "Roger Pack" wrote:
>Hello.
>As a note if
algorithm problems; for
example, we spend many instructions on absCoeff[numNonZero]. If we allow
sparse zeros inside the array, we can eliminate many of those instructions.
Regards,
Min Chen
At 2022-03-02 07:28:15, "Pop, Sebastian" wrote:
Hi,
the attached patch fixes the registra
Hi Sebastian,
Thank you for your contribution; I have no more comments for now.
Regards,
Min Chen
2022-01-29 02:35:24,"Pop, Sebastian"
Hi,
> [MC] how about CMHI with a vector register that hold zeros?
This works wonderfully, thanks for the suggestion!
Perform
Hi Sebastian,
Thank you for the further explanation; my comments are inline.
At 2022-01-28 10:08:36, "Pop, Sebastian" wrote:
Hi Min Chen,
Thank you for your review comments, that helped improve the performance of
scanPosLast on arm64:
scanPosLast 5.46x
Hi Sebastian,
Thank you for your contribution. I reviewed it and made some comments; could
you please take a look?
Regards,
Min Chen
At 2022-01-19 23:25:30, "Pop, Sebastian" wrote:
Hi Gopi,
Please find attached a patch that ports scanPosLast to arm64 NEON.
s
Hi Nathan,
Ah, I had the same question a couple of years ago.
The root cause lies in the type of the intermediate variable: the sign bit will
affect the high part of the combined variable.
So if you change sum_t to sum2_t for variables A & B, you will get the correct result.
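One possible reading of the sign-bit remark, as a sketch only. The names sum_t and sum2_t come from the message, but the widths used here (16 and 32 bits), the helper packSumAndDiff and the sample values are assumptions for illustration. Two values are packed into one word, with (a0 + a1) in the low half and (a0 - a1) in the high half. With the narrow type a negative input is zero-extended, so the borrow never reaches the high half; with the wide type it does.

```cpp
#include <cassert>
#include <cstdint>

// Assumed widths for illustration: sum_t holds one sum, sum2_t holds two.
using sum_t  = uint16_t;
using sum2_t = uint32_t;

// Pack (d0 + d1) into the low 16 bits and (d0 - d1) into the high 16 bits.
// T is the type of the intermediate variables (the "A & B" of the message).
template <typename T>
uint32_t packSumAndDiff(int d0, int d1)
{
    const T a0 = static_cast<T>(d0);
    const T a1 = static_cast<T>(d1);
    const uint32_t low  = static_cast<uint32_t>(a0 + a1);
    const uint32_t high = static_cast<uint32_t>(a0 - a1) << 16;
    return low + high;
}

// Reference value: low = d0 + d1, high = d0 - d1, combined in two's complement.
inline uint32_t packReference(int d0, int d1)
{
    return static_cast<uint32_t>((d0 + d1) + (d0 - d1) * 65536);
}
```

With d0 = -2 and d1 = 1, the wide intermediate matches the reference while the narrow one ends up with a high half that is off by one.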
Regards,
Min Chen
At 2021-11-25 1
Hi,
stdint.h is a C/C++ standard header file.
sys/time.h depends on the OS, but I guess it is not necessary at run time; it
is mostly used for collecting performance data.
memory.h may be dropped if you declare the memory management functions, such as
malloc(), in other headers.
Regards,
Min Chen
At 2021
Hi,
Code looks good.
My only comment is that UADALP is slower; we can adjust the order of the sums to avoid it.
Regards,
Min Chen
2021-08-07 02:01:13,"Pop, Sebastian"
Hi,
the attached patch ports to arm64 the following kernel:
ssim_4x4x2_core 30.69x 13.39 410.85
Ok
,
Min Chen
At 2021-07-31 12:14:29, "Pop, Sebastian" wrote:
Hi,
Please let me know if you have ideas on how to make this code faster.
I tried to remove the stall by fetching more memory earlier, still no change in
performance:
// void scale2D_64to32(pixel* dst, const pixel* src
Hi,
The code looks good.
There is little performance change because of a pipeline stall: the two LD1s
can't hide the latency penalty. But it is not a big problem, and we saved code size.
Could you please make a standalone patch? I guess a patch on top of a patch is not a good idea.
Regards,
Min Chen
At 2021-07-31 02:27:36
to LD1+ADDLP
btw: excuse me, the other patches need more time; I will probably review them on the weekend.
Regards,
Min Chen
2021-07-30 06:13:34,"Pop, Sebastian"
Hi,
the attached patch ports to arm64 the following kernels:
scale1D_128to64 68.89x 12.06
Hi Sebastian,
Looks good now, thanks.
Regards,
Min Chen
At 2021-07-27 23:50:19, "Pop, Sebastian" wrote:
Thanks Min Chen for your reviews.
In the attached patch I used dup instead of memory load, and I rescheduled some
of the instructions to avoid pipel
Hi,
I have just a few comments.
+.macro addAvg_start
+lsl x3, x3, #1
+lsl x4, x4, #1
+movrel x11, addAvg_offset
+ld1 {v30.8h}, [x11]
All of the values in addAvg_offset are 0x40, so why not use DUP?
+add v0.8h, v0.8h,
Looks good, thanks.
2021-07-27 02:53:10,"Pop, Sebastian"
Hi,
the attached patch ports to arm64 the following kernels:
cpy2Dto1D_shl[4x4] 15.69x 6.73 105.60
cpy2Dto1D_shr[4x4] 12.97x 6.65 86.28
cpy2Dto1D_shl[8x8] 43.32x 8.85 383.16
the
W12 in the range [-16,0]
Please also remember that W0 is the low part of X0, and the result in register S4 is int32.
The rest of the patch looks good.
Regards,
Min Chen
At 2021-07-25 13:31:06, "Pop, Sebastian" wrote:
Hi,
> You didn't see an improvement because you still use USHR, after C
Hi,
That's my fault, I missed that part of SAD, so your code is fine now,
thank you.
Regards,
Min Chen
At 2021-07-24 03:54:46, "Pop, Sebastian" wrote:
Hi Min Chen,
thanks for your reviews.
> +.macro SAD_X_END_64 x
> +uaddlp v16.4s, v16.8h
> The
Looks good
2021-07-24 07:04:03,"Pop, Sebastian"
Hi,
the attached patch ports to arm64 the following kernels:
avg_pp[ 4x4] 8.50x 8.85 75.21
avg_pp_aligned[ 4x4] 8.49x 8.89 75.46
avg_pp[ 8x8] 29.12x 11.61 338.01
At 2021-07-24 05:23:44, "Pop, Sebastian" wrote:
Hi,
> +fmov w12, s4
> +neg w12, w12
> +add w0, w12, #16
> (-w12) + 16 equals 16 - w12; loading #16 into w0 may execute in parallel with
> FMOV.
I see a small improvement with this change.
[x6], #4
+st1 {v17.s}[0], [x6], #4
+st1 {v18.s}[0], [x6], #4
I guess STP may store two results in one cycle
Regards,
Min Chen
2021-07-22 14:30:50,"Pop, Sebastian"
Hi,
the attached patch ports to arm64 the following kernels:
sad_x3[
.
Regards,
Min Chen
At 2021-07-20 12:45:03, "Pop, Sebastian" wrote:
Thanks Min Chen for your reviews.
I tried your suggestion to remove one of the FP->GPR transfers.
With the following patch I do not see any improvement for the 64x routines, and
the number of instructions rem
,v16
Regards,
Min Chen
2021-07-17 04:44:05,"Pop, Sebastian"
Hi,
the attached patch ports to arm64 the following kernels:
sad[ 4x4] 10.11x 6.50 65.72
sad[ 8x8] 28.95x 8.50 246.00
sad[ 8x4] 23.03x 5.45
Hi Sebastian,
It looks good, thanks.
Regards,
Min Chen
At 2021-07-08 02:20:01, "Pop, Sebastian" wrote:
Attached the amended patch with movi.
That improved performance, thanks!
I have seen the cmp/br pattern several times.
We can do the reordering tuning after all the i
odata.
Please see the attached patch.
Sebastian
From: x265-devel on behalf of chen
Reply-To: Development for x265
Date: Friday, July 2, 2021 at 8:11 PM
To: Development for x265
Subject: RE: [EXTERNAL] [x265] [arm64] port LUMA_VPP_4xN
Hi,
I put my comments inline. Thanks.
btw: I found one more improvement for this patch.
+eor v17.16b, v17.16b, v17.16b
The register clear operation may be replaced by MOVI
At 2021-07-03 02:43:07, "Pop, Sebastian" wrote:
Hi,
thanks for your review.
> +#ifdef __MACH__
> +# define
Hello,
Thank you for your patch; I have some comments.
+#ifdef __MACH__
+# define MACH
+#else
+# define MACH #
It is not a good idea to bypass .const_data
+ld1 {v0.s}[0], [x0], x1
+ld1 {v0.s}[1], [x0], x1
+ushll v0.8h, v0.8b, #0
...
+//
I have no comments on this patch, thanks.
2021-06-25 01:45:03,"Pop, Sebastian"
Added one missing function:
convert_p2s[48x64] 1.56x 300.44 469.25
___
x265-devel mailing list
x265-devel@videolan.org
The patch looks good; no more modifications are necessary, thanks.
btw: you didn't see a change with CBNZ, I guess for two reasons. One is that
'sub x9' is in the first part of the loop; I would rather move such independent
instructions into pipeline stall slots. The second is that the loop count is not large enough
It looks good to me, thanks.
btw: ARM64 has the new instructions CBZ / CBNZ.
At 2021-06-24 10:11:32, "Pop, Sebastian" wrote:
I added the following change in the attached patch.
It has better performance with ldp as it allows to re-schedule the instructions
in independent ways:
function
You are welcome.
On your CPU ldp is still slower, so we can keep the original version and improve it
again in the future.
This version looks good to me; thank you for your contribution.
At 2021-06-24 10:01:40, "Pop, Sebastian" wrote:
Thanks again Chen for your careful review and recom
Could you please also try the comments in my last email? Thanks.
At 2021-06-24 09:09:09, "Pop, Sebastian" wrote:
> +.macro filterPixelToShort_64xN h
> +function x265_filterPixelToShort_64x\h\()_neon
> +add x3, x3, x3
> +sub x3, x3, #0x40
> +movi
Thank you for your response; comments inline.
At 2021-06-24 08:57:20, "Pop, Sebastian" wrote:
Hi Chen,
Thanks for your review!
> +function x265_filterPixelToShort_4x4_neon
> +add x3, x3, x3
> +movi v2.8h, #0xe0, lsl #8
> are you compiler do
Hi Sebastian,
Thanks for your patch.
I have some comments.
+function x265_filterPixelToShort_4x4_neon
+add x3, x3, x3
+movi v2.8h, #0xe0, lsl #8
Does your compiler not handle the constant 0xe000 automatically? It is more readable
+ld1 {v0.s}[0],
Hello,
Sorry for the delay.
I have fixed these warnings with the new nasm version in my local tree, but I
don't know how to merge it into the current x265 tree; please wait for the x265
team to fix these issues.
Regards,
Min Chen
At 2021-04-09 15:48:37, "Mario *LigH* Rohkrämer" wrote:
>
Regards,
Min Chen
At 2020-09-03 23:56:13, "Nomis101" wrote:
>Am 03.09.20 um 15:28 schrieb Mario *LigH* Rohkrämer:
>> In the meantime, MSYS2 provides NASM 2.15.04; same output.
>>
>
>I had a patch for this in this list. Maybe you could try if this patch will
Hi Gopi, thank you for helping review these patches.
At 2020-08-27 00:42:11, "Gopi Satykrishna Akisetty"
wrote:
Hi Min,
On Thu, Aug 20, 2020 at 7:48 AM chen wrote:
Hi Damiano,
Thank you for the information.
I took a quick look; it is based on intrinsics, and the performance strongly depends
o
leave that start at September
2020.
Regards,
Min Chen
At 2020-08-20 00:01:52, "Damiano Galassi" wrote:
>Hi, Apple contributed to the HandBrake project a x265 patch
>with a bunch of neon asm to improve x265 performance on Apple’s upcoming ARM
>Macs,
>but I don’t have the
Hi Xiyuan,
I have forwarded the email to you directly.
Regards,
Min Chen
2020-03-18 09:38:18,"Xiyuan Wang"
Hi chen
we didn't receive your reply about Part-1, can you resend it? Maybe the
content is too large and the mail list blocked it. You can just quote the code
a/source/common/pixel.cpp b/source/common/pixel.cpp
index 99b84449c..e4f890cd5 100644
--- a/source/common/pixel.cpp
+++ b/source/common/pixel.cpp
@@ -5,6 +5,7 @@
* Mandar Gurav
* Mahesh Pittala
* Min Chen
+ * Hongbin Liu
*
* This program is free software
At 2020-02-27 16:59:18, "Niranjan Bala" wrote:
+double computeBrightnessIntensity(pixel *inPlane, int width, int height,
intptr_t stride)
+{
+pixel* rowStart = inPlane;
Qualifying it with a const prefix may be better.
+double count = 0;
Why is it declared as double?
+
+for (int i = 0; i <
From 7e495390396d6a55f95ad4649e46b56fd7d2ef1c Mon Sep 17 00:00:00 2001
From: Min Chen
Date: Mon, 4 Nov 2019 16:21:20 +0800
Subject: [PATCH] Improve all_angs_pred_c by remove unnecessary transpose
---
source/common/intrapred.cpp | 22 +++---
1 file changed, 3 insertions(+), 19
At 2019-09-23 12:50:22, "Akil" wrote:
# HG changeset patch
# User Akil Ayyappan
# Date 1568370446 -19800
# Fri Sep 13 15:57:26 2019 +0530
# Node ID 531f6b03eed0a40a38d3589dec03f14743293146
# Parent c4b098f973e6b0ee4aee3bf0d7b54da4e2734d42
Adaptive Frame duplication
+uint32_t y = 0;
+
Could you please try #include ?
At 2019-09-09 10:18:13, "qw" wrote:
Hi,
The latest vmaf source code is used, but I still fail to build x265. Below is
the error message:
Scanning dependencies of target common
[ 1%] Building ASM_NASM object common/CMakeFiles/common.dir/x86/pixel-a.asm.o
[
ead.
At 2019-07-15 13:58:53, "Akil" wrote:
Thanks for your suggestions, Chen. Have added the matrix in comments. That
should make the code more readable. Regarding the last point, I think
(rowNum+X)*stride cannot be replaced by a constant since it tends to change
every time.
On Fri, Jul 12,
On Wed, Jul 10, 2019 at 3:41 PM Akil wrote:
# HG changeset patch
# User Akil Ayyappan
# Date 1561035091 -19800
# Thu Jun 20 18:21:31 2019 +0530
# Node ID d25c33cc2b748401c5e908af445a0a110e26c3cf
# Parent 4f6dde51a5db4f9229bddb60db176f16ac98f505
AQ: New AQ mode with Variance and Edge
The debug info affects the compiler's code generation, so a little performance is
lost, but we can ignore it since it is not much.
Regards,
Min
At 2019-05-18 22:09:06, "qw" wrote:
If I want to build x265 with release and debug info, I will choose the option
of CMAKE_BUILD_TYPE=RelWithDebInfo.
Is
Hi,
Could you please try it with multilib.bat?
It is Steve's idea: we build the lib twice with different bit depths and combine
these libs into one multi-feature lib.
Regards,
Min Chen
At 2019-05-17 11:06:13, "qw" wrote:
hi,
I read x265 source code, and find one function, as s
Just say it works.
First of all,
the expected algorithm is the square of (x >> shift).
It is an 8-bit value (I assume we are talking about 8bpp; 16bpp is similar)
multiplied by an 8-bit value, and the result is 16 bits.
The function works at CU level, so blockSize is at most 64, i.e. 6 bits.
So, we can decide the
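The bit budget in this message can be checked with a few lines of arithmetic. This is a sketch under the message's own assumptions (8-bit samples, block size up to 64); the function name maxSquaredSum is hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: worst-case sum of squares over a blockSize x blockSize
// block when every sample has the maximum value for sampleBits bits.
constexpr uint64_t maxSquaredSum(unsigned sampleBits, unsigned blockSize)
{
    return ((uint64_t{1} << sampleBits) - 1) * ((uint64_t{1} << sampleBits) - 1)
           * blockSize * blockSize;
}

// 16 bits per square plus 6 + 6 bits for a 64x64 block gives 28 bits,
// so the 8bpp worst case fits below 2^28 and well inside 32 bits.
static_assert(maxSquaredSum(8, 64) < (uint64_t{1} << 28), "fits in 28 bits");
```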
Hi, I would like to configure the sad function in COST_MV for another
platform. However, the assembly code would not be supported on the other
platform. Where can I find the original programming language code that was
made into the assembly language code?
This patch removes unnecessary pow() and abs()
0001-improve-pow-x-2.patch
Description: Binary data
I found that p.chroma[X265_CSP_I420].pu[i].p2s is not initialized on the ARM
platform, so all of them run the C model. I guess these functions could reuse
NEON's convert_p2s[*]
Please ignore my previous email; the initial value of dcVal is width, so this
module has no bug. Sorry for the disturbance.
Hi,
There is a long-standing bug in our intra prediction DC mode; see details:
HM
// Function for calculating DC value of the reference samples used in Intra
prediction
//NOTE: Bit-Limit - 25-bit source
Pel
I found some issues in the ARM code; I didn't point them out in time, which is my failure.
Such as this garbage code in x265_pixel_add_ps_4x4_neon:
vmov.u16 q10, #255
veor.u16 q11, q11
veor.u16 d3, d3
veor.u16 d5, d5
btw: the ARM build was broken after
There are a series of performance issues, such as:
uint32_t sum = (uint32_t)pow((outOfBound >> 2), 2);
Do you want to get the square of a small integer?
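For the pow() line quoted above, a plain integer multiply gives the same value without the round trip through double. This is a minimal sketch; the function name squareOfShifted is hypothetical, and the variable name is taken from the quoted line.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: square of (outOfBound >> 2) using integer arithmetic
// only, instead of pow(double, double) followed by a cast back to uint32_t.
inline uint32_t squareOfShifted(uint32_t outOfBound)
{
    const uint32_t v = outOfBound >> 2;
    return v * v;
}
```

For example, an input of 20 is shifted down to 5 and squared to 25.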
Sorry, I missed a line; resending with an additional comment.
At 2018-04-07 01:27:34, "chen" <chenm...@163.com> wrote:
At 2018-04-06 21:17:37, mythr...@multicorewareinc.com wrote:
># HG changeset patch
># User Jayashree
># Date 1517283539 28800
># Mon Jan 29 19
At 2018-04-06 21:17:37, mythr...@multicorewareinc.com wrote:
># HG changeset patch
># User Jayashree
># Date 1517283539 28800
># Mon Jan 29 19:38:59 2018 -0800
># Node ID 3c6e5ce07dbca7f967e4b5b62fe450979da3bf81
># Parent 624c83571d1df840e1206c46e589044fbf87ff32
>x86: AVX512
Hi,
Thank you for reporting this bug.
I think the root cause is not sizeof(); a negative stride is invalid in the
encoder/decoder core.
To avoid such invalid input parameters, x264 inserts a middle layer that
converts color spaces and images, but x265 does not.
Of course, crashing is the worst way to
At 2018-01-11 00:06:29, "Andrey Semashev" <andrey.semas...@gmail.com> wrote:
>On 01/10/18 18:53, chen wrote:
>> Hi Andrey,
>>
>> Our coding rules prohibit inline assembly, especially since the patch uses GCC
>> extension syntax.
>
>Ok, I
Hi Andrey,
Our coding rules prohibit inline assembly, especially since the patch uses GCC
extension syntax.
The "lock" prefix will lock the CPU bus, which incurs a greater penalty on
multi-core systems.
Thanks,
Min
At 2018-01-10 23:30:06, "Andrey Semashev" wrote:
>Any
SSSE3 pmulhrsw can also improve on pmullw+paddw+psraw
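For reference, a scalar sketch of why pmulhrsw can stand in for the three-instruction sequence. Per signed 16-bit lane, pmulhrsw computes ((a * b >> 14) + 1) >> 1. The rounding constant 16, the shift of 5 and the function names below are assumptions chosen for illustration, not taken from the patch under review. With them, pre-scaling the coefficient by 1 << 10 gives the same result as multiply, add, shift, as long as the 16-bit product does not overflow.

```cpp
#include <cassert>
#include <cstdint>

// Scalar model of one pmulhrsw lane: rounded high part of the 32-bit product.
// Relies on arithmetic right shift of negative values (guaranteed since C++20,
// and what mainstream compilers do in practice before that).
inline int16_t mulHighRoundScale(int16_t a, int16_t b)
{
    const int32_t product = static_cast<int32_t>(a) * static_cast<int32_t>(b);
    return static_cast<int16_t>(((product >> 14) + 1) >> 1);
}

// Scalar model of pmullw + paddw + psraw with assumed constants 16 and 5.
inline int16_t mulAddShift(int16_t x, int16_t c)
{
    const int16_t low = static_cast<int16_t>(x * c);     // pmullw: low 16 bits
    const int16_t sum = static_cast<int16_t>(low + 16);  // paddw: rounding term
    return static_cast<int16_t>(sum >> 5);               // psraw: shift right
}

// Coefficient pre-scaled so that pmulhrsw's implicit >> 15 acts like >> 5.
inline int16_t scaleCoeff(int16_t c)
{
    return static_cast<int16_t>(c << 10);
}
```

For instance, x = 100 and c = 7 give 22 through either path; the pre-scaled coefficient must still fit in a signed 16-bit lane, which here limits c to values below 32.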
At 2017-11-28 23:57:50, "Ximing Cheng" wrote:
># HG changeset patch
># User Ximing Cheng
># Date 1511862059 -28800
># Tue Nov 28 17:40:59 2017 +0800
># Node ID
I have a few comments.
At 2017-11-28 23:57:50, "Ximing Cheng" wrote:
>diff -r b24454f3ff6d -r 9cd0cf6e2fd8 source/common/x86/const-a.asm
>--- a/source/common/x86/const-a.asm Wed Nov 22 22:00:48 2017 +0530
>+++ b/source/common/x86/const-a.asm Tue Nov 28 17:40:59
>diff -r a7c2f80c18af -r 973560d58dfb source/common/x86/intrapred8.asm
>--- a/source/common/x86/intrapred8.asm Mon Nov 20 14:31:22 2017 +0530
>+++ b/source/common/x86/intrapred8.asm Tue Nov 21 03:10:14 2017 +0800
>@@ -22313,11 +22313,144 @@
> mov [r1 + 64], r3b ;
From 360c25c6198e7aaa3a9f0ad611d99f94a1ea6347 Mon Sep 17 00:00:00 2001
From: Min Chen <chenm...@163.com>
Date: Wed, 28 Jun 2017 11:54:05 -0500
Subject: [PATCH] fix build error on VS2008 ( ambiguous on pow() )
---
source/encoder/slicetype.cpp |3 ++-
1 files changed, 2 insertions
Hi Guillaume,
Our development platform is Visual Studio, whose compiler can't auto-vectorize.
We also can't assume users have an advanced compiler on their computers.
Regards,
Min
At 2017-05-08 19:36:24,"Guillaume POIRIER" wrote:
>Hello Praveen Tiwari,
>
>Just for curiosity,
Good morning Michael,
I restricted the number of slices because we have a limited number of output
NAL buffers.
Every slice needs an independent NAL, but the SPS/PPS/VPS will also take at
least one NAL, so I limited the slice count to (MAX_NAL_UNITS - 1)
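A minimal sketch of the limit described above. The value 16 for the NAL buffer count, the lower bound of 1 and the function name clampSliceCount are assumptions for illustration; only the (MAX_NAL_UNITS - 1) upper bound comes from the message.

```cpp
#include <algorithm>
#include <cassert>

// Assumed value standing in for MAX_NAL_UNITS; check the real definition.
constexpr int kMaxNalUnits = 16;

// Hypothetical helper: keep one NAL buffer free for SPS/PPS/VPS, so the
// slice count is capped at kMaxNalUnits - 1 (and at least one slice).
inline int clampSliceCount(int requested)
{
    return std::min(std::max(requested, 1), kMaxNalUnits - 1);
}
```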
Best regards,
Min
At 2017-03-14
# HG changeset patch
# User Min Chen <chenm...@163.com>
# Date 1479924604 21600
# Node ID c5ea19f5852aadd42bedd1d9fe4eb4b350a31e73
# Parent a895b6344a82f2b5a0f8bc4ba7a913f0c40d114d
fix logic timing bug
---
source/encoder/framefilter.cpp | 11 ---
1 files changed, 8 insertions
# HG changeset patch
# User Min Chen <min.c...@multicorewareinc.com>
# Date 1479317016 21600
# Node ID 99a4a2d29d5c2b997745b06e5954a03bc080478f
# Parent 4c1652f3884fba9fab4c589dd057b12e6bf33d5b
cleanup debug code
---
source/encoder/sao.cpp |4 +---
1 files changed, 1 insertions
# HG changeset patch
# User Min Chen <min.c...@multicorewareinc.com>
# Date 1478030336 18000
# Node ID 201758801366fb5e5b59710d87f4b8da911d6b73
# Parent 5fe7ac3068ebedc3d58451518c54c501e3c41103
[slices] restrict mv never beyond boundary in both slices and non-slices mode
---
source/e
2016-11-01 11:40:45,"Pradeep Ramachandran" <prad...@multicorewareinc.com> :
On Mon, Oct 31, 2016 at 11:03 PM, chen <chenm...@163.com> wrote:
# HG changeset patch
# User Min Chen <min.c...@multicorewareinc.com>
# Date 1477935084 18000
# Node ID 9be03f087899
# HG changeset patch
# User Min Chen <min.c...@multicorewareinc.com>
# Date 1477935084 18000
# Node ID 9be03f08789954f772a50f26485a9c96ca745497
# Parent b08109b3701e9b86010c5a5ed0ad7b3d6a051911
[slices] fix multi-slices output non-determination bug
---
source/common/common.h
From e697fcd5fa0d36b33d42d01c2845ca36533dbd96 Mon Sep 17 00:00:00 2001
From: Min Chen <min.c...@multicorewareinc.com>
Date: Thu, 27 Oct 2016 11:11:09 -0500
Subject: [PATCH] [slices] allow number of slices more than rows (Issue #300-3)
---
source/common/param.cpp|2 --
source/e
All of his original files are in another patch, which is very large, so the
mailing list is blocking it until you approve it.
At 2016-10-25 11:59:45,"Pradeep Ramachandran" <prad...@multicorewareinc.com>
wrote:
On Tue, Oct 25, 2016 at 2:59 AM, chen <chenm...@16
From 1bea85513646e4d9d992bbe326a9cb3275ec313a Mon Sep 17 00:00:00 2001
From: Min Chen <min.c...@multicorewareinc.com>
Date: Mon, 24 Oct 2016 16:38:55 -0500
Subject: [PATCH] [PPC] GPL v2 copyright header
---
source/common/ppc/dct_altivec.cpp | 24
source/
From d23527c6204921b782ef8bc2f1a69de88163202a Mon Sep 17 00:00:00 2001
From: Min Chen <min.c...@multicorewareinc.com>
Date: Mon, 24 Oct 2016 16:27:35 -0500
Subject: [PATCH] [PPC] support option --no-asm to disable Altivec
---
source/CMakeLists.txt|2 +-
source/common/c
Thank you for helping reply to that message.
I am the developer for WPP and Slices. The motion vectors are now restricted to
the slice boundary, and I will apply the same restriction to Tiles. In the future,
we will add a new user option to allow MVs beyond the boundary.
Please note, it is a low priority task in
+0800 (CST)
From: chen <chenm...@163.com>
To: "Development for x265" <x265-devel@videolan.org>
Subject: Re: [x265] Optimize slice QP in PPS for x265
Hello Xuefeng,
Your idea is good; in a low bitrate environment, the MVs and headers are the most
important parts of the bitstream.
I took a look at your code, and there seem to be some problems.
You calculate the correlation between sliceQp and the QP range (it is [0, 51] without
range extension), so you will get a constant
This patch introduces a logic bug: m_reconRowFlag and numRowFinished are used to
enable the SAO filter once all rows have finished.
At 2016-09-27 19:17:16,as...@multicorewareinc.com wrote:
># HG changeset patch
># User Ashok Kumar Mishra
># Date 1474974965 -19800
># Tue Sep 27
nd binary both are
different. I applied your patch, built once (like an 8-bit build), collected all
depth outputs (8, 10 and 12), and compared them with three builds of x265, i.e.
8 bit, 10 bit and 12 bit.
Regards,
Praveen
On Fri, Sep 23, 2016 at 2:47 AM, chen <chenm...@163.com> wrote:
Hi Prave
depth outputs (8, 10 and 12), and compared them with three builds of x265, i.e.
8 bit, 10 bit and 12 bit.
Regards,
Praveen
On Fri, Sep 23, 2016 at 2:47 AM, chen <chenm...@163.com> wrote:
Hi Praveen,
I tested your command line on my VS2008 build.
I built three bit-depth versions and compared with
On Thu, Sep 15, 2016 at 1:55 AM, chen <chenm...@163.com> wrote:
From ea50e494473623ed0dbff2907194aaf268dc449a Mon Sep 17 00:00:00 2001
From: Min Chen <min.c...@multicorewareinc.com>
Date: Wed, 14 Sep 2016 15:23:38 -0500
Subject: [PATCH] [multi-lib] Support 8+10+12 bits in single DLL (Wo
From ea50e494473623ed0dbff2907194aaf268dc449a Mon Sep 17 00:00:00 2001
From: Min Chen <min.c...@multicorewareinc.com>
Date: Wed, 14 Sep 2016 15:23:38 -0500
Subject: [PATCH] [multi-lib] Support 8+10+12 bits in single DLL (Workaround)
---
source/CMakeLists.txt
From ea93a3ddb7e8c7e106955acef56f6df72a15587a Mon Sep 17 00:00:00 2001
From: Min Chen <min.c...@multicorewareinc.com>
Date: Tue, 13 Sep 2016 10:59:09 -0500
Subject: [PATCH] [slice] fix help information defaule value mistake
---
source/x265cli.h |2 +-
1 files changed, 1 insertions
Thank you for pointing out my mistake. I forgot to check the default value field;
I have fixed this bug now.
At 2016-09-13 15:12:33,"Mario *LigH* Rohkrämer" <cont...@ligh.de> wrote:
>Am 07.09.2016, 22:27 Uhr, schrieb Min Chen <chenm...@163.com>:
>
>> +H0(" --[n
From dc6d861fd8f91c90e6bbdee366cfb7df5fdf183f Mon Sep 17 00:00:00 2001
From: Min Chen <min.c...@multicorewareinc.com>
Date: Mon, 12 Sep 2016 13:18:32 -0500
Subject: [PATCH] [slice] verify untest path and enable it
---
source/encoder/frameencoder.cpp |2 +-
1 files changed, 1 inse
From e409325885d196b53d9824ee861867a696e6df51 Mon Sep 17 00:00:00 2001
From: Min Chen <min.c...@multicorewareinc.com>
Date: Wed, 7 Sep 2016 15:25:49 -0500
Subject: [PATCH] [slice] cleanup debug code
---
source/encoder/frameencoder.cpp | 10 +-
source/encoder/framefilter.cpp